METHODS, COMPOSITIONS AND KITS
FOR THE DETECTION AND MONITORING OF LUNG CANCER 

TECHNICAL FIELD OF THE INVENTION 

The present invention relates generally to the field of cancer diagnostics.
More specifically, the present invention relates to methods, compositions and kits for the
detection of lung cancer in patients with different types, stages and grades of tumors that
employ oligonucleotide hybridization and/or amplification to simultaneously detect two or
more tissue-specific polynucleotides in a biological sample suspected of containing lung
cancer cells.



BACKGROUND OF THE INVENTION
Field of the Invention 

Lung cancer remains a significant health problem throughout the world. The failure
of conventional lung cancer treatment regimens can commonly be attributed, in part, to
delayed disease diagnosis. Although significant advances have been made in the area of
lung cancer diagnosis, there still remains a need for improved detection methodologies that
permit early, reliable and sensitive determination of the presence of lung cancer cells.

Description of the Related Art

Lung cancer has the highest mortality rate of any cancer and is one of the most
difficult to diagnose early. It causes an estimated 1 million deaths annually
worldwide. According to the American Cancer Society, in 2002 alone there
were an estimated 169,200 new cases diagnosed and ~154,900 deaths. Lung
cancers are typically classified into two major types: Non-Small Cell Lung Carcinomas
(NSCLC), comprising squamous, adeno and large cell carcinomas, and Small Cell Lung
Carcinomas (SCLC). These groups represent ~75% and 25% of all lung tumors, respectively,
with adenocarcinoma and squamous cell carcinoma being the most prevalent forms of NSCLC
and large cell carcinomas accounting for ~10%. Within the group of NSCLC, adenocarcinoma is
currently the most prominent form of lung cancer in younger persons, women of all ages,
lifetime nonsmokers and long-term former smokers. SCLC typically fall into two subtypes:
oat cell and intermediate cell. Less common tumors include carcinoid and mesotheliomas,
among others, but these represent only a small percentage of all lung tumors. In almost all
cases early diagnosis of NSCLC is elusive, and most lung cancers have already
metastasized by the time they are detected. Only 16.7% are localized on initial diagnosis.
If tumors can be detected at a point where they are confined, then the combination of
chemotherapy and radiation has a possibility of success, but overall the 5-year prognosis is
very poor, with only a 10-15% survival rate. The picture with SCLC is even bleaker, with
only 6% localized at initial diagnosis and 5-year survival rates of ~6%.

X-ray and computed tomography of the chest and abdomen are frequently used in the
diagnosis of lung tumors but lack sensitivity for detecting small foci and usually detect
tumors that have already metastasized. Sputum cytology as a potential screening method in
high-risk individuals has only been partially effective and often does not yield tumor type.
To stage the disease, CAT scans, MRI or bone scans are used to evaluate the spread of
disease. Treatment for lung cancer is typically surgery, radiation or chemotherapy, or
combinations thereof, but usually with poor outcome due to the late diagnosis of disease.

The current tests for lung cancer either lack the clinical sensitivity to detect early
tumors, provide inadequate stage/grade information, or lack tumor specificity because the
markers they detect may originate from other tumor types or be present in benign lung
disorders. There is therefore a need to develop specific tests that can improve lung cancer
diagnosis and prognosis and potentially differentiate between NSCLC and SCLC. The present
invention achieves these and other related objectives by providing methods that are useful
for the identification of tissue-specific polynucleotides, in particular tumor-specific
polynucleotides, as well as antibodies and methods, compositions and kits for the detection
and monitoring of cancer cells in a patient afflicted with the disease.

SUMMARY OF THE INVENTION

The present invention provides methods for detecting the presence of lung
cancer cells in a patient. Such methods comprise the steps of: (a) obtaining a biological
sample from the patient; (b) contacting the biological sample with
oligonucleotide pairs specific for independent polynucleotide sequences which are
unrelated to one another, wherein the oligonucleotide pairs hybridize, under moderately
stringent conditions, to their respective polynucleotides and the complements thereof; (c)
amplifying the polynucleotides; and (d) detecting the amplified polynucleotides; wherein
the presence of one or more of the amplified polynucleotides indicates the presence of lung
cancer cells in the patient.

In some embodiments, detection of the amplified polynucleotides may be preceded
by a fractionation step such as, for example, gel electrophoresis. Alternatively or
additionally, detection of the amplified polynucleotides may be achieved by hybridization
of a labeled oligonucleotide probe that hybridizes specifically, under moderately stringent
conditions, to such polynucleotide. Oligonucleotide labeling may be achieved by
incorporating a radiolabeled nucleotide or by incorporating a fluorescent label.

In certain preferred embodiments, cells of a specific tissue type may be enriched
from the biological sample prior to the steps of detection. Enrichment may be achieved by
a methodology selected from the group consisting of cell capture and cell depletion.
Exemplary cell capture methods include immunocapture and comprise the steps of: (a)
adsorbing an antibody to a tissue-specific cell surface antigen to cells in said biological
sample; and (b) separating the antibody adsorbed tissue-specific cells from the remainder
of the biological sample. Exemplary cell depletion may be achieved by cross-linking red
cells and white cells followed by a subsequent fractionation step to remove the
cross-linked cells.

Alternative embodiments of the present invention provide methods for determining
the presence or absence of lung cancer in a patient, comprising the steps of: (a) contacting a
biological sample obtained from the patient with two or more oligonucleotides that
hybridize to two or more polynucleotides that encode two or more lung tumor proteins; (b)
detecting in the sample a level of at least one of the polynucleotides (such as, for example,
mRNA) that hybridize to the oligonucleotides; and (c) comparing the level of
polynucleotides that hybridize to the oligonucleotides with a predetermined cut-off value,
and therefrom determining the presence or absence of lung cancer in the patient. Within
certain embodiments, the amount of mRNA is detected via polymerase chain reaction
using, for example, at least one oligonucleotide primer that hybridizes to a polynucleotide
encoding a polypeptide as recited above, or a complement of such a polynucleotide.
Within other embodiments, the amount of mRNA is detected using a hybridization
technique, employing an oligonucleotide probe that hybridizes to a polynucleotide that
encodes a polypeptide as recited above, or a complement of such a polynucleotide.

In related aspects, methods are provided for monitoring the progression of lung
cancer in a patient, comprising the steps of: (a) contacting a biological sample obtained
from a patient with two or more oligonucleotides that hybridize to two or more
polynucleotides that encode lung tumor proteins; (b) detecting in the sample an amount of
the polynucleotides that hybridize to the oligonucleotides; (c) repeating steps (a) and (b)
using a biological sample obtained from the patient at a subsequent point in time; and (d)
comparing the amount of polynucleotide detected in step (c) with the amount detected in
step (b) and therefrom monitoring the progression of the cancer in the patient.
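
The comparison of amounts between time points in the monitoring method can be illustrated with a minimal Python sketch. The function name, the time labels, the copy numbers and the 2-fold change threshold are all hypothetical and chosen only for illustration; the specification does not prescribe any particular threshold.

```python
def progression_trend(amounts, fold_threshold=2.0):
    """Compare marker amounts measured at successive time points.

    amounts: list of (time_label, amount) tuples in chronological order.
    fold_threshold: hypothetical fold change treated as meaningful.
    Returns a list of (from_label, to_label, fold_change, call) tuples.
    """
    calls = []
    for (t0, a0), (t1, a1) in zip(amounts, amounts[1:]):
        if a0 <= 0:
            # No detectable marker earlier: any later signal is an increase.
            fold = float("inf") if a1 > 0 else 1.0
        else:
            fold = a1 / a0
        if fold >= fold_threshold:
            call = "increase"
        elif fold <= 1.0 / fold_threshold:
            call = "decrease"
        else:
            call = "stable"
        calls.append((t0, t1, fold, call))
    return calls


# Hypothetical amounts (e.g., copies of one marker per sample).
series = [("baseline", 100.0), ("month3", 250.0), ("month6", 240.0), ("month9", 60.0)]
for row in progression_trend(series):
    print(row)
```

An "increase" call between two samples would be read as consistent with progression, and a "decrease" as consistent with response, under the stated assumptions.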

Certain embodiments of the present invention provide that the step of amplifying
said first polynucleotide and said second polynucleotide is achieved by the polymerase
chain reaction (PCR).

The present invention also provides kits that are suitable for performing the
detection methods of the present invention. Exemplary kits comprise oligonucleotide
primer pairs each one of which specifically hybridizes to a distinct polynucleotide. Within
certain embodiments, kits according to the present invention may also comprise a nucleic
acid polymerase and suitable buffer.

These and other aspects of the present invention will become apparent upon 
reference to the following detailed description and attached drawings. All references 
disclosed herein are hereby incorporated by reference in their entirety as if each was 
incorporated individually.

BRIEF DESCRIPTION OF SEQUENCE IDENTIFIERS

SEQ ID NO:1 is the determined cDNA sequence of L762P.
SEQ ID NO:2 is the amino acid sequence encoded by the sequence of SEQ ID NO:1.
SEQ ID NO:3 is the determined cDNA sequence of L984P.
SEQ ID NO:4 is the amino acid sequence encoded by the sequence of SEQ ID NO:3.
SEQ ID NO:5 is the determined cDNA sequence of L550S.
SEQ ID NO:6 is the amino acid sequence encoded by the sequence of SEQ ID NO:5.
SEQ ID NO:7 is the determined cDNA sequence of L552S.
SEQ ID NO:8 is the amino acid sequence encoded by the sequence of SEQ ID NO:7.
SEQ ID NO:9 is the DNA sequence of L552S INT forward primer.
SEQ ID NO:10 is the DNA sequence of L552S INT reverse primer.
SEQ ID NO:11 is the DNA sequence of L552S Taqman probe.
SEQ ID NO:12 is the DNA sequence of L550S INT forward primer.
SEQ ID NO:13 is the DNA sequence of L550S INT reverse primer.
SEQ ID NO:14 is the DNA sequence of L550S Taqman probe.
SEQ ID NO:15 is the DNA sequence of L726P INT forward primer.
SEQ ID NO:16 is the DNA sequence of L726P INT reverse primer.
SEQ ID NO:17 is the DNA sequence of L726P Taqman probe.
SEQ ID NO:18 is the DNA sequence of L984P INT forward primer.
SEQ ID NO:19 is the DNA sequence of L984P INT reverse primer.
SEQ ID NO:20 is the DNA sequence of L984P Taqman probe.
SEQ ID NO:21 is the determined cDNA sequence of L763P.
SEQ ID NO:22 is the amino acid sequence encoded by the sequence of SEQ ID NO:21.
SEQ ID NO:23 is the DNA sequence of L763P INT forward primer.
SEQ ID NO:24 is the DNA sequence of L763P reverse primer.
SEQ ID NO:25 is the DNA sequence of L763P Taqman probe.
SEQ ID NO:26 is the determined cDNA sequence of L587.
SEQ ID NO:27 is the amino acid sequence encoded by the sequence of SEQ ID NO:26.
SEQ ID NO:28 is the DNA sequence of L587 INT forward primer.
SEQ ID NO:29 is the DNA sequence of L587 INT reverse primer.
SEQ ID NO:30 is the DNA sequence of L587 Taqman probe.
SEQ ID NO:31 is the determined cDNA sequence of L523.
SEQ ID NO:32 is the amino acid sequence encoded by the sequence of SEQ ID NO:31.
SEQ ID NO:33 is the DNA sequence of L523 primer.
SEQ ID NO:34 is the DNA sequence of L523 primer.

DETAILED DESCRIPTION OF THE INVENTION 

As noted above, the present invention is directed generally to methods that are
suitable for the identification of tissue-specific polynucleotides as well as to methods,
compositions and kits that are suitable for the diagnosis and monitoring of lung cancer. In
particular, such methods, compositions and kits are suitable for use in the diagnosis,
differentiation and/or prognosis of NSCLC and SCLC. Such diagnostic methods will form
the basis for a molecular diagnostic test for detecting lung cancer metastases in lung tissue
and for the detection of anchorage independent lung cancer cells in blood as well as in
mediastinal lymph nodes or distant metastases.

A variety of genes have been identified as over-expressed in lung tumors, in
particular squamous or adeno forms of NSCLC or small cell carcinomas. These include,
but are not limited to: L762P, L984P, L550S/L548S, L552S/L547S, L552/L547S, L200T,
L514S, L551S, L587S, L763S, L773P, L801P, L985P, L1058C, L1081C, L523S,
OF1783P, B307D (WIPO International Patent Application Nos: WO 99/47674, published
September 23, 1999; WO 00/61612, published October 19, 2000; WO 02/00174, published
January 3, 2002; WO 02/47534, published June 20, 2002; WO 01/72295, published
October 4, 2001; WO 02/092001, published November 21, 2002; WO 01/00828, published
January 1, 2001; WO 02/04514, published January 17, 2002; WO 01/92525, published
December 6, 2002; WO 02/02623, published January 10, 2002. US Patent Nos: Wang et
al., 6,482,597, issued November 22, 2002; Wang et al., 6,518,256, issued February 11,
2003; Wang et al., 6,426,072, issued July 30, 2002; Reed et al., 6,210,883, issued April 3,
2001; Wang et al., 6,504,010, issued January 7, 2003; Wang et al., 6,509,448, issued
January 21, 2003. Wang et al., Oncogene, 21(49):7598-7604, 2002 (collagen type XI alpha

These genes were identified and characterized using PCR and cDNA library
subtractions as well as electronic subtractions with each of the tumor types individually.
The cDNAs identified were then evaluated by microarray and then by Real Time PCR on tissue
panels to identify specific expression patterns. Table 1 highlights the specificity of these
genes for either adeno or squamous forms of NSCLC or both, as well as genes specific for
small cell lung carcinoma. In some cases reactivity with large cell carcinomas has also
been identified by Real Time PCR analysis.



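Table 1

Columns: Gene; Squamous; Adeno; Small cell; Large cell; Normal
Genes (as recoverable): L514S, L763P, L773P, L801P, L978P, L985P, L1058C, L1081C, L523S, OF1783P, B307D
[Table cell entries are not recoverable from the source text.]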



Identification of Tissue-Specific Polynucleotides

Certain embodiments of the present invention provide methods, compositions and
kits for the detection of lung cancer cells within a biological sample from patients with
different types, stages and grades of tumor. These methods comprise the step of detecting
one or more tissue-specific polynucleotide(s) from a patient's biological sample, the
over-expression of which polynucleotides indicates the presence of lung cancer cells within
the patient's biological sample. Accordingly, the present invention also provides methods that
are suitable for the identification of tissue-specific polynucleotides. As used herein, the
phrases "tissue-specific polynucleotides" or "tumor-specific polynucleotides" are meant to
include all polynucleotides that are at least two-fold over-expressed as compared to one or
more control tissues. As discussed in further detail herein below, over-expression of a
given polynucleotide may be assessed, for example, by microarray and/or quantitative
real-time polymerase chain reaction (Real-time PCR™) methodologies.

Exemplary methods for detecting tissue-specific polynucleotides may comprise the
steps of: (a) performing a genetic subtraction to identify a pool of polynucleotides from a
tissue of interest; (b) performing a DNA microarray analysis to identify a first subset of
said pool of polynucleotides of interest wherein each member polynucleotide of said first
subset is at least two-fold over-expressed in said tissue of interest as compared to a control
tissue; and (c) performing a quantitative polymerase chain reaction analysis on
polynucleotides within said first subset to identify a second subset of polynucleotides that
are at least two-fold over-expressed as compared to said control tissue.
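
The two-stage selection in steps (b) and (c) can be sketched in Python as follows. This is only an illustration of the at-least-two-fold filter applied first to microarray signals and then to quantitative PCR signals; the function names, gene labels and signal values are hypothetical, and the sketch assumes that each measurement has already been reduced to a single positive number per gene and tissue.

```python
def overexpressed(tumor, control, min_fold=2.0):
    """Return the genes whose tumor signal is at least min_fold times control."""
    selected = set()
    for gene, t in tumor.items():
        c = control.get(gene)
        # Skip genes with no usable control measurement.
        if c is not None and c > 0 and t / c >= min_fold:
            selected.add(gene)
    return selected


def select_tissue_specific(pool, array_tumor, array_control, qpcr_tumor, qpcr_control):
    """Apply the microarray filter, then the quantitative PCR filter."""
    # Step (b): first subset, from microarray signals of the subtracted pool.
    first = overexpressed(
        {g: array_tumor[g] for g in pool if g in array_tumor}, array_control
    )
    # Step (c): second subset, from quantitative PCR on the first subset only.
    second = overexpressed(
        {g: qpcr_tumor[g] for g in first if g in qpcr_tumor}, qpcr_control
    )
    return first, second


# Hypothetical pool and signals, for illustration only.
pool = {"geneA", "geneB", "geneC", "geneD"}
array_tumor = {"geneA": 400, "geneB": 150, "geneC": 900, "geneD": 50}
array_control = {"geneA": 100, "geneB": 100, "geneC": 300, "geneD": 50}
qpcr_tumor = {"geneA": 30.0, "geneC": 12.0}
qpcr_control = {"geneA": 5.0, "geneC": 10.0}
first, second = select_tissue_specific(
    pool, array_tumor, array_control, qpcr_tumor, qpcr_control
)
print(sorted(first), sorted(second))
```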

Polynucleotides Generally

As used herein, the term "polynucleotide" refers generally to either DNA or RNA
molecules. Polynucleotides may be naturally occurring as normally found in a biological
sample such as blood, serum, lymph node, bone marrow, sputum, urine and tumor biopsy
samples. Alternatively, polynucleotides may be derived synthetically by, for example, a
nucleic acid polymerization reaction. As will be recognized by the skilled artisan,
polynucleotides may be single-stranded (coding or antisense) or double-stranded, and may
be DNA (genomic, cDNA or synthetic) or RNA molecules. RNA molecules include
HnRNA molecules, which contain introns and correspond to a DNA molecule in a
one-to-one manner, and mRNA molecules, which do not contain introns. Additional coding or
non-coding sequences may, but need not, be present within a polynucleotide of the present
invention, and a polynucleotide may, but need not, be linked to other molecules and/or
support materials.

Polynucleotides may comprise a native sequence (i.e. an endogenous sequence that 
encodes a tumor protein, such as a lung tumor protein, or a portion thereof) or may 
comprise a variant, or a biological or antigenic functional equivalent of such a sequence.
Polynucleotide variants may contain one or more substitutions, additions, deletions and/or 
insertions, as further described below. The term "variants" also encompasses homologous 
genes of xenogenic origin. 

When comparing polynucleotide or polypeptide sequences, two sequences are said
to be "identical" if the sequence of nucleotides or amino acids in the two sequences is the
same when aligned for maximum correspondence, as described below. Comparisons
between two sequences are typically performed by comparing the sequences over a
comparison window to identify and compare local regions of sequence similarity. A
"comparison window" as used herein, refers to a segment of at least about 20 contiguous
positions, usually 30 to about 75, or 40 to about 50, in which a sequence may be compared to
a reference sequence of the same number of contiguous positions after the two sequences
are optimally aligned.

Optimal alignment of sequences for comparison may be conducted using the
Megalign program in the Lasergene suite of bioinformatics software (DNASTAR, Inc.,
Madison, WI), using default parameters. This program embodies several alignment
schemes described in the following references: Dayhoff, M.O. (1978) A model of
evolutionary change in proteins - Matrices for detecting distant relationships. In Dayhoff,
M.O. (ed.) Atlas of Protein Sequence and Structure, National Biomedical Research
Foundation, Washington DC Vol. 5, Suppl. 3, pp. 345-358; Hein J. (1990) Unified
Approach to Alignment and Phylogeny pp. 626-645 Methods in Enzymology vol. 183,
Academic Press, Inc., San Diego, CA; Higgins, D.G. and Sharp, P.M. (1989) CABIOS
5:151-153; Myers, E.W. and Muller W. (1988) CABIOS 4:11-17; Robinson, E.D. (1971)
Comb. Theor. 11:105; Saitou, N. and Nei, M. (1987) Mol. Biol. Evol. 4:406-425; Sneath,
P.H.A. and Sokal, R.R. (1973) Numerical Taxonomy - the Principles and Practice of
Numerical Taxonomy, Freeman Press, San Francisco, CA; Wilbur, W.J. and Lipman, D.J.
(1983) Proc. Natl. Acad. Sci. USA 80:726-730.

Alternatively, optimal alignment of sequences for comparison may be conducted by
the local identity algorithm of Smith and Waterman (1981) Adv. Appl. Math 2:482, by the
identity alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48:443, by the
search for similarity methods of Pearson and Lipman (1988) Proc. Natl. Acad. Sci. USA
85:2444, by computerized implementations of these algorithms (GAP, BESTFIT, BLAST,
FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer
Group (GCG), 575 Science Dr., Madison, WI), or by inspection.

One preferred example of algorithms that are suitable for determining percent
sequence identity and sequence similarity are the BLAST and BLAST 2.0 algorithms,
which are described in Altschul et al. (1997) Nucl. Acids Res. 25:3389-3402 and Altschul
et al. (1990) J. Mol. Biol. 215:403-410, respectively. BLAST and BLAST 2.0 can be used,
for example with the parameters described herein, to determine percent sequence identity
for the polynucleotides and polypeptides of the invention. Software for performing
BLAST analyses is publicly available through the National Center for Biotechnology
Information. In one illustrative example, cumulative scores can be calculated using, for
nucleotide sequences, the parameters M (reward score for a pair of matching residues;
always >0) and N (penalty score for mismatching residues; always <0). For amino acid
sequences, a scoring matrix can be used to calculate the cumulative score. Extension of the
word hits in each direction is halted when: the cumulative alignment score falls off by the
quantity X from its maximum achieved value; the cumulative score goes to zero or below,
due to the accumulation of one or more negative-scoring residue alignments; or the end of
either sequence is reached. The BLAST algorithm parameters W, T and X determine the
sensitivity and speed of the alignment. The BLASTN program (for nucleotide sequences)
uses as defaults a wordlength (W) of 11, an expectation (E) of 10, and the BLOSUM62
scoring matrix (see Henikoff and Henikoff (1992) Proc. Natl. Acad. Sci. USA 89:10915)
alignments, (B) of 50, expectation (E) of 10, M=5, N=-4 and a comparison of both strands.

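
The halting rules for extending a word hit can be illustrated with a simplified, ungapped, one-directional extension in Python. This is a sketch of the scoring logic described above and not the BLAST implementation: it starts from a score of zero rather than from a scored word hit, the function name is hypothetical, and the default X of 20 is an assumed value (only M=5 and N=-4 are taken from the text).

```python
def extend_right(query, subject, q_start, s_start, M=5, N=-4, X=20):
    """Ungapped rightward extension from a seed position.

    M: reward for a matching pair (>0); N: penalty for a mismatch (<0).
    X: drop-off from the best score at which extension halts (assumed value).
    Returns (best_score, best_length): the maximum cumulative score and
    the number of aligned positions at which it was reached.
    """
    score = 0
    best_score = 0
    best_len = 0
    i = 0
    while q_start + i < len(query) and s_start + i < len(subject):
        score += M if query[q_start + i] == subject[s_start + i] else N
        i += 1
        if score > best_score:
            best_score, best_len = score, i
        # Halt when the score falls off by X from its maximum,
        # or when it goes to zero or below.
        if best_score - score >= X or score <= 0:
            break
    # The loop also ends when the end of either sequence is reached.
    return best_score, best_len


print(extend_right("ACGTACGT", "ACGTACGT", 0, 0))
print(extend_right("ACGTTTTT", "ACGTAAAA", 0, 0, X=8))
```

In the second call the extension stops two positions after the last match, because the score has dropped by X from its maximum; the reported alignment is trimmed back to the best-scoring length.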
Preferably, the "percentage of sequence identity" is determined by comparing two
optimally aligned sequences over a window of comparison of at least 20 positions, wherein
the portion of the polynucleotide or polypeptide sequence in the comparison window may
comprise additions or deletions (i.e., gaps) of 20 percent or less, usually 5 to 15 percent, or
10 to 12 percent, as compared to the reference sequence (which does not comprise
additions or deletions) for optimal alignment of the two sequences. The percentage is
calculated by determining the number of positions at which the identical nucleic acid base
or amino acid residue occurs in both sequences to yield the number of matched positions,
dividing the number of matched positions by the total number of positions in the reference
sequence (i.e., the window size) and multiplying the result by 100 to yield the percentage
of sequence identity.
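
The percentage calculation just described can be written out as a short Python sketch. It assumes the two sequences have already been optimally aligned and are supplied as equal-length strings with "-" marking gaps; the function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_query, aligned_reference, gap="-"):
    """Percent identity of a pre-aligned pair over the reference window.

    Matched positions are those where both rows carry the same residue.
    The denominator is the number of reference positions (the window size),
    so gap characters in the reference row are not counted.
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        1
        for q, r in zip(aligned_query, aligned_reference)
        if q == r and q != gap
    )
    window = sum(1 for r in aligned_reference if r != gap)
    if window == 0:
        raise ValueError("reference contains no residues")
    return 100.0 * matches / window


# Hypothetical 10-position window: one deletion and one mismatch in the query.
print(percent_identity("ACGT-ACGTA", "ACGTTACGTT"))
```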

Therefore, the present invention encompasses polynucleotide and polypeptide 
sequences having substantial identity to the sequences disclosed herein, for example those 
comprising at least 50% sequence identity, preferably at least 55%, 60%, 65%, 70%, 75%, 
80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% or higher, sequence identity compared to a 
polynucleotide or polypeptide sequence of this invention using the methods described 
herein, (e.g., BLAST analysis using standard parameters, as described below). One skilled 
in this art will recognize that these values can be appropriately adjusted to determine 
corresponding identity of proteins encoded by two nucleotide sequences by taking into 
account codon degeneracy, amino acid similarity, reading frame positioning and the like.

In additional embodiments, the present invention provides isolated polynucleotides 
and polypeptides comprising various lengths of contiguous stretches of sequence identical 
to or complementary to one or more of the sequences disclosed herein. For example,
polynucleotides are provided by this invention that comprise at least about 15, 20, 30, 40,
50, 75, 100, 150, 200, 300, 400, 500 or 1000 or more contiguous nucleotides of one or
more of the sequences disclosed herein as well as all intermediate lengths there between. It
will be readily understood that "intermediate lengths", in this context, means any length
between the quoted values, such as 16, 17, 18, 19, etc.; 21, 22, 23, etc.; 30, 31, 32, etc.; 50,
51, 52, 53, etc.; 100, 101, 102, 103, etc.; 150, 151, 152, 153, etc.; including all integers
through 200-500; 500-1,000, and the like.

The polynucleotides of the present invention, or fragments thereof, regardless of the 
length of the coding sequence itself, may be combined with other DNA sequences, such as 
promoters, polyadenylation signals, additional restriction enzyme sites, multiple cloning
sites, other coding segments, and the like, such that their overall length may vary 
considerably. It is therefore contemplated that a nucleic acid fragment of almost any length 
may be employed, with the total length preferably being limited by the ease of preparation 
and use in the intended recombinant DNA protocol. For example, illustrative DNA 
segments with total lengths of about 10,000, about 5,000, about 3,000, about 2,000, about
1,000, about 500, about 200, about 100, about 50 base pairs in length, and the like, 
(including all intermediate lengths) are contemplated to be useful in many implementations 
of this invention. 

In other embodiments, the present invention is directed to polynucleotides that are 
capable of hybridizing under moderately stringent conditions to a polynucleotide sequence 
provided herein, or a fragment thereof, or a complementary sequence thereof. 
Hybridization techniques are well known in the art of molecular biology. For purposes of 
illustration, suitable moderately stringent conditions for testing the hybridization of a 
polynucleotide of this invention with other polynucleotides include prewashing in a 
solution of 5X SSC, 0.5% SDS, 1.0 mM EDTA (pH 8.0); hybridizing at 50°C-65°C, 5X
SSC, overnight; followed by washing twice at 65°C for 20 minutes with each of 2X, 0.5X
and 0.2X SSC containing 0.1% SDS.

Moreover, it will be appreciated by those of ordinary skill in the art that, as a result 
of the degeneracy of the genetic code, there are many nucleotide sequences that encode a 
polypeptide as described herein. Some of these polynucleotides bear minimal homology to 
the nucleotide sequence of any native gene. Nonetheless, polynucleotides that vary due to 
differences in codon usage are specifically contemplated by the present invention. Further, 
alleles of the genes comprising the polynucleotide sequences provided herein are within the 
scope of the present invention. Alleles are endogenous genes that are altered as a result of 
one or more mutations, such as deletions, additions and/or substitutions of nucleotides.
The resulting mRNA and protein may, but need not, have an altered structure or function. 
Alleles may be identified using standard techniques (such as hybridization, amplification 
and/or database sequence comparison). 
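
The scale of the codon degeneracy mentioned above can be made concrete with a small Python calculation. Under the standard genetic code, the number of distinct nucleotide sequences that encode a given peptide is the product of the number of codons available for each residue. The function name and the example peptides below are hypothetical; the codon counts are those of the standard genetic code.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code
# (61 sense codons in total; stop codons are not included).
CODONS_PER_RESIDUE = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def count_encodings(peptide):
    """Count the nucleotide sequences that encode the given peptide."""
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide.upper())


# Hypothetical short peptides, for illustration only.
for pep in ("MW", "MLS", "ARNDL"):
    print(pep, count_encodings(pep))
```

Even a five-residue peptide can be encoded by hundreds of different sequences, which is why many encoding polynucleotides bear little resemblance to the native gene.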

Microarray Analyses

Polynucleotides that are suitable for detection according to the methods of the 
present invention may be identified, as described in more detail below, by screening a 
microarray of cDNAs for tissue and/or tumor-associated expression (e.g., expression that is
at least two-fold greater in a tumor than in normal tissue, as determined using a
representative assay provided herein). Such screens may be performed, for example, using
a Synteni microarray (Palo Alto, CA) according to the manufacturer's instructions (and
essentially as described by Schena et al., Proc. Natl. Acad. Sci. USA 93:10614-10619
(1996) and Heller et al., Proc. Natl. Acad. Sci. USA 94:2150-2155 (1997)).

Microarray analysis is an effective method for evaluating large numbers of genes, but due to
its limited sensitivity it may not accurately determine the absolute tissue distribution of low
abundance genes, or it may underestimate the degree of overexpression of more abundant
genes due to signal saturation. For those genes showing overexpression by microarray
expression profiling, further analysis was performed using quantitative RT-PCR based on
Taqman™ probe detection, which offers a greater dynamic range of sensitivity. Several
different panels of normal and tumor tissues, distant metastases and cell lines were used for
this purpose.

Quantitative Real-time Polymerase Chain Reaction

Suitable polynucleotides according to the present invention may be further
characterized or, alternatively, originally identified by employing a quantitative PCR
methodology such as, for example, the Real-time PCR methodology. By this methodology,
tissue and/or tumor samples, such as, e.g., metastatic tumor samples, may be tested
alongside the corresponding normal tissue sample and/or a panel of unrelated normal tissue
samples.

Real-time PCR (see Gibson et al., Genome Research 6:995-1001, 1996; Heid et al.,
Genome Research 6:986-994, 1996) is a technique that evaluates the level of PCR product
accumulation during amplification. This technique permits quantitative evaluation of
mRNA levels in multiple samples. Briefly, mRNA is extracted from tumor and normal
tissue and cDNA is prepared using standard techniques.

Real-time PCR may, for example, be performed either on the ABI 7700 Prism or
on a GeneAmp® 5700 sequence detection system (Applied Biosystems, Foster City, CA).
The 7700 system uses a forward and a reverse primer in combination with a specific probe
with a 5' fluorescent reporter dye at one end and a 3' quencher dye at the other end
(Taqman™). When the Real-time PCR is performed using Taq DNA polymerase with 5'-3'
nuclease activity, the probe is cleaved and begins to fluoresce, allowing the reaction to be
monitored by the increase in fluorescence (Real-time). The 5700 system uses SYBR®
green, a fluorescent dye that only binds to double-stranded DNA, and the same forward
and reverse primers as the 7700 instrument. Matching primers and fluorescent probes may
be designed according to the Primer Express program (Applied Biosystems, Foster City,
CA). Optimal concentrations of primers and probes are initially determined by those of
ordinary skill in the art. Control (e.g., β-actin) primers and probes may be obtained
commercially from, for example, Perkin Elmer/Applied Biosystems (Foster City, CA).

To quantitate the amount of specific RNA in a sample, a standard curve is
generated using a plasmid containing the gene of interest. Standard curves are generated
using the Ct values determined in the real-time PCR, which are related to the initial cDNA
concentration used in the assay. Standard dilutions ranging from 10 to 10^6 copies of the gene
of interest are generally sufficient. In addition, a standard curve is generated for the control
sequence. This permits standardization of initial RNA content of a tissue sample to the
amount of control for comparison purposes.
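
The standard-curve quantitation can be sketched in Python under the usual assumption that Ct is linear in the logarithm of the starting copy number. The least-squares fit, the function names and all Ct values below are illustrative assumptions rather than measured data or the applicants' software.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate starting copies from a Ct."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(target_ct, target_curve, control_ct, control_curve):
    """Target copies divided by control copies measured in the same sample."""
    target = copies_from_ct(target_ct, *target_curve)
    control = copies_from_ct(control_ct, *control_curve)
    return target / control


# Hypothetical plasmid dilution series from 10 to 10^6 copies.
dilutions = [1e1, 1e2, 1e3, 1e4, 1e5, 1e6]
ct = [36.7, 33.4, 30.1, 26.8, 23.5, 20.2]  # illustrative values only
curve = fit_standard_curve(dilutions, ct)
print(curve)
print(copies_from_ct(30.1, *curve))
```

Dividing the target estimate by the control estimate from the same sample gives a value that can be compared across tissues with different RNA input.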

In accordance with the above, and as described further below, the present invention
provides the illustrative lung tissue- and/or tumor-specific polynucleotides L762P, L984P,
L550S, L552S, L763P and L587 having sequences set forth in SEQ ID NOs: 1, 3, 5, 7, 21
and 26, respectively, and illustrative polypeptides encoded thereby having amino acid
sequences set forth in SEQ ID NOs: 2, 4, 6, 8, 22 and 27, that may be suitably employed in
the detection of cancer, more specifically, lung cancer.

Methodologies for the Detection of Cancer

In general, a cancer cell may be detected in a patient based on the presence of one
or more polynucleotides within cells of a biological sample (for example, blood, lymph
nodes, bone marrow, sera, sputum, urine and/or tumor biopsies) obtained from the patient.
In other words, such polynucleotides may be used as markers to indicate the presence or
absence of a cancer such as, e.g., lung cancer.


As discussed in further detail herein, the present invention achieves these and other
related objectives by providing a methodology for the simultaneous detection of more than 
one polynucleotide, the presence of which is diagnostic of the presence of lung cancer cells 
in a biological sample. Each of the various cancer detection methodologies disclosed
herein has in common a step of hybridizing one or more oligonucleotide primers and/or
probes, the hybridization of which is demonstrative of the presence of a tumor- and/or
tissue-specific polynucleotide. Depending on the precise application contemplated, it may
be preferred to employ one or more intron-spanning oligonucleotides that are inoperative
against genomic DNA polynucleotides and, thus, are effective in
substantially reducing and/or eliminating the detection of genomic DNA in the biological
sample.

Further disclosed herein are methods for enhancing the sensitivity of these detection 
methodologies by subjecting the biological samples to be tested to one or more cell capture 
and/or cell depletion methodologies. 

By certain embodiments of the present invention, the presence of lung cancer cells in
a patient may be determined by employing the following steps: (a) contacting a biological
sample obtained from the patient with two or more oligonucleotides that hybridize to two
or more polynucleotides that encode two or more lung tumor proteins as described herein;
(b) detecting in the sample a level of at least one of the polynucleotides (such as, for
example, mRNA) that hybridize to the oligonucleotides; and (c) comparing the level of
polynucleotides that hybridize to the oligonucleotides with a predetermined cut-off value,
and therefrom determining the presence or absence of lung cancer in the patient. 

To permit hybridization under assay conditions, oligonucleotide primers and probes
should comprise an oligonucleotide sequence that has at least about 60%, preferably at
least about 75% and more preferably at least about 90%, identity to a portion of a
polynucleotide encoding a lung tumor protein that is at least 10 nucleotides, and preferably
at least 20 nucleotides, in length. Preferably, oligonucleotide primers hybridize to a
polynucleotide encoding a polypeptide described herein under moderately stringent
conditions, as defined above. Oligonucleotide primers which may be usefully employed in
the diagnostic methods described herein preferably are at least 10-40 nucleotides in length.
In a preferred embodiment, the oligonucleotide primers comprise at least 10 contiguous
nucleotides, more preferably at least 15 contiguous nucleotides, of a DNA molecule having
a sequence recited in SEQ ID NO: 1, 3, 5 or 7. Techniques for both PCR based assays and
hybridization assays are well known in the art (see, for example, Mullis et al., Cold Spring
Harbor Symp. Quant. Biol., 51:263, 1987; Erlich ed., PCR Technology, Stockton Press,
NY, 1989).
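
The identity criterion above can be sketched in Python as a sliding-window comparison. This is only an illustration: identity is computed against the strand as written (no complementing), the helper names are hypothetical, and the target sequence is invented rather than taken from any SEQ ID NO.

```python
def percent_identity(oligo, segment):
    """Percent of positions that match between two equal-length sequences."""
    if len(oligo) != len(segment):
        raise ValueError("sequences must be the same length")
    matches = sum(a == b for a, b in zip(oligo.upper(), segment.upper()))
    return 100.0 * matches / len(oligo)

def best_window_identity(oligo, target):
    """Slide the oligo along the target; return (best identity, offset)."""
    k = len(oligo)
    if k == 0 or k > len(target):
        raise ValueError("oligo must be non-empty and no longer than the target")
    best_identity, best_offset = -1.0, -1
    for i in range(len(target) - k + 1):
        identity = percent_identity(oligo, target[i:i + k])
        if identity > best_identity:
            best_identity, best_offset = identity, i
    return best_identity, best_offset

def meets_identity_threshold(oligo, target, min_identity=60.0, min_length=10):
    """Apply the 'at least about 60% identity over at least 10 nucleotides' test."""
    if len(oligo) < min_length:
        return False
    identity, _ = best_window_identity(oligo, target)
    return identity >= min_identity

# Invented target sequence, for illustration only.
TARGET = "ATGGCCTTAGCGATCGGATCCATTGCA"
exact = best_window_identity("TAGCGATCGG", TARGET)
one_mismatch = best_window_identity("TAGCCATCGG", TARGET)
```

Raising `min_identity` to 75.0 or 90.0 gives the preferred and more preferred thresholds named in the text.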

The present invention also provides amplification-based methods for detecting the
presence of lung cancer cells in a patient. Exemplary methods comprise the steps of (a)
obtaining a biological sample from the patient; (b) contacting the biological sample with
two or more oligonucleotide pairs specific for independent polynucleotide sequences which
are unrelated to one another, wherein the oligonucleotide pairs hybridize, under moderately
stringent conditions, to their respective polynucleotides and the complements thereof; (c)
amplifying the polynucleotides; and (d) detecting the amplified polynucleotides; wherein
the presence of one or more of the amplified polynucleotides indicates the presence of lung
cancer cells in the patient.

Methods according to the present invention are suitable for identifying 
polynucleotides obtained from a wide variety of biological samples such as, for example,
blood, serum, lymph node, bone marrow, sputum, urine and tumor biopsy samples, among
others. 

Certain exemplary embodiments of the present invention provide methods wherein
the polynucleotides to be detected are selected from the group consisting of L762, L984, 
L550, L552, L763 and L587. Alternatively and/or additionally, polynucleotides to be 
detected may be selected from the group consisting of those depicted in SEQ ID NOs: 1, 3, 
5, 7, 21 and 26.

Suitable exemplary oligonucleotide probes and/or primers that may be used
according to the methods of the present invention are disclosed herein. In certain preferred
embodiments that eliminate the background detection of genomic DNA, the
oligonucleotides may be intron-spanning oligonucleotides.

Depending on the precise application contemplated, the artisan may prefer to detect
the tissue- and/or tumor-specific polynucleotides by detecting a radiolabel and/or detecting a
fluorophore. More specifically, the oligonucleotide probe and/or primer may comprise a
detectable moiety such as, for example, a radiolabel and/or a fluorophore.

Alternatively or additionally, methods of the present invention may also comprise a
step of fractionation prior to detection of the tissue- and/or tumor-specific polynucleotides
such as, for example, by gel electrophoresis.

In other embodiments, methods described herein may be used to monitor the
progression of cancer. By these embodiments, assays as provided for the diagnosis of lung
cancer may be performed over time, and the change in the level of reactive polypeptide(s)
or polynucleotide(s) evaluated. For example, the assays may be performed every 24-72
hours for a period of 6 months to 1 year, and thereafter performed as needed. In general, a
cancer is progressing in those patients in whom the level of polypeptide or polynucleotide
detected increases over time. In contrast, the cancer is not progressing when the level of
reactive polypeptide or polynucleotide either remains constant or decreases with time.
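
The monitoring rule above (rising level indicates progression; constant or falling level does not) can be sketched as follows. This is a simplified illustration that summarizes the time course by the slope of a least-squares line, which is an assumption of this sketch and not a procedure stated in the text; the function names and sample values are hypothetical.

```python
def expression_slope(days, levels):
    """Least-squares slope of marker level versus time (level units per day)."""
    n = len(days)
    if n < 2 or n != len(levels):
        raise ValueError("need at least two paired time points")
    mean_t = sum(days) / n
    mean_l = sum(levels) / n
    stt = sum((t - mean_t) ** 2 for t in days)
    if stt == 0:
        raise ValueError("time points must not all be identical")
    stl = sum((t - mean_t) * (l - mean_l) for t, l in zip(days, levels))
    return stl / stt

def classify_progression(days, levels, min_slope=0.0):
    """Label a series 'progressing' only if the level trends upward over time."""
    if expression_slope(days, levels) > min_slope:
        return "progressing"
    return "not progressing"

# Hypothetical assay results taken 48 hours apart.
rising = classify_progression([0, 2, 4], [1.0, 1.4, 2.1])
flat = classify_progression([0, 2, 4], [2.0, 2.0, 2.0])
falling = classify_progression([0, 2, 4], [2.1, 1.4, 1.0])
```

A non-zero `min_slope` could be used to ignore changes that fall within assay noise.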

Certain in vivo diagnostic assays may be performed directly on a tumor. One such
assay involves contacting tumor cells with a binding agent. The bound binding agent may
then be detected directly or indirectly via a reporter group. Such binding agents may also
be used in histological applications. Alternatively, polynucleotide probes may be used
within such applications.

As noted above, to improve sensitivity, multiple lung tumor protein markers may be
assayed within a given sample. It will be apparent that binding agents specific for different 
proteins provided herein may be combined within a single assay. Further, multiple primers 
or probes may be used concurrently. The selection of tumor protein markers may be based 
on routine experiments to determine combinations that result in optimal sensitivity. In
addition, or alternatively, assays for tumor proteins provided herein may be combined with
assays for other known tumor antigens.

Cell Enrichment

In other aspects of the present invention, cell capture technologies may be used 
prior to polynucleotide detection to improve the sensitivity of the various detection 
methodologies disclosed herein. 

In exemplary cell enrichment methodologies, immunomagnetic beads that are
coated with specific monoclonal antibodies to cell surface markers, or tetrameric antibody
complexes, may be used to first enrich or positively select cancer cells in a sample.
Various commercially available kits may be used, including Dynabeads® Epithelial Enrich 
(Dynal Biotech, Oslo, Norway), StemSep™ (StemCell Technologies, Inc., Vancouver, 
BC), and RosetteSep (StemCell Technologies). The skilled artisan will recognize that 
other readily available methodologies and kits may also be suitably employed to enrich or 
positively select desired cell populations. 

Dynabeads® Epithelial Enrich contains magnetic beads coated with mAbs specific 
for two glycoprotein membrane antigens expressed on normal and neoplastic epithelial 
tissues. The coated beads may be added to a sample and the sample then applied to a 
magnet, thereby capturing the cells bound to the beads. The unwanted cells are washed 
away and the magnetically isolated cells eluted from the beads and used in further analyses. 

RosetteSep can be used to enrich cells directly from a blood sample and consists of
a cocktail of tetrameric antibodies that target a variety of unwanted cells and crosslink
them to glycophorin A on red blood cells (RBC) present in the sample, forming rosettes.
When centrifuged over Ficoll, targeted cells pellet along with the free RBC.
The combination of antibodies in the depletion cocktail determines which cells will
be removed and consequently which cells will be recovered. Antibodies that are available 
include, but are not limited to: CD2, CD3, CD4, CD5, CD8, CD10, CD11b, CD14, CD15,
CD16, CD19, CD20, CD24, CD25, CD29, CD33, CD34, CD36, CD38, CD41, CD45,
CD45RA, CD45RO, CD56, CD66B, CD66e, HLA-DR, IgE, and TCRαβ. Additionally, it
is contemplated in the present invention that mAbs specific for lung tumor antigens can be
developed and used in a similar manner. For example, mAbs that bind to tumor-specific
cell surface antigens may be conjugated to magnetic beads, or formulated in a tetrameric
antibody complex, and used to enrich or positively select metastatic lung tumor cells from
a sample. Such a system can be used to evaluate blood samples from different forms of
lung cancers, in particular adeno and squamous forms of NSCLC and small cell carcinomas,
for the presence of circulating tumor cells using the inventive multiplex PCR assay as
described herein. 

Once a sample is enriched or positively selected, cells may be further analyzed. For
example, the cells may be lysed and RNA isolated. RNA may then be subjected to
RT-PCR analysis using lung tumor-specific multiplex primers in a Real-time PCR assay as
described herein. 

In another aspect of the present invention, cell capture technologies may be used in 
conjunction with Real-Time PCR to provide a more sensitive tool for detection of
metastatic cells expressing lung tumor antigens.

Yet another method that may be employed is an anti-ganglioside Gmi/G M i cell 
capture antibody system. Gangliosides are cell membrane-bound glycosphingolipids,
several species of which have been shown to be over-expressed on the cell surface of most
cancers of neuroectodermal and epithelial origin, in particular lung cancer. Cell surface
expression of GM2 is seen in several types of lung cancer, particularly in SCLC, which makes
it an attractive target for a monoclonal antibody based lung cancer immunotherapy and also
for use as a capture method in conjunction with G M i- 

Probes and Primers

As noted above and as described in further detail herein, certain methods,
compositions and kits according to the present invention utilize two or more
oligonucleotide primer pairs for the detection of lung cancer. The ability of such nucleic
acid probes to specifically hybridize to a sequence of interest will enable them to be of use in
detecting the presence of complementary sequences in a biological sample.

Alternatively, in other embodiments, the probes and/or primers of the present
invention may be employed for detection via nucleic acid hybridization. As such, it is
contemplated that nucleic acid segments that comprise a sequence region of at least about
15 nucleotide long contiguous sequence that has the same sequence as, or is
complementary to, a 15 nucleotide long contiguous sequence of a polynucleotide to be
detected will find particular utility. Longer contiguous identical or complementary
sequences, e.g., those of about 20, 30, 40, 50, 100, 200, 500, 1000 (including all
intermediate lengths) and even up to full length sequences will also be of use in certain
embodiments.

Oligonucleotide primers having sequence regions consisting of contiguous
nucleotide stretches of 10-14, 15-20, 30, 50, or even of 100-200 nucleotides or so
(including intermediate lengths as well), identical or complementary to a polynucleotide to
be detected, are particularly contemplated as primers for use in amplification reactions such
as, e.g., the polymerase chain reaction (PCR™). This would allow a polynucleotide to be
analyzed, both in diverse biological samples such as, for example, blood, lymph nodes and
bone marrow.

The use of a primer of about 15-25 nucleotides in length allows the formation of a
duplex molecule that is both stable and selective. Molecules having contiguous
complementary sequences over stretches greater than 15 bases in length are generally
preferred, though, in order to increase stability and selectivity of the hybrid, and thereby
improve the quality and degree of specific hybrid molecules obtained. One will generally
prefer to design primers having gene-complementary stretches of 15 to 25 contiguous
nucleotides, or even longer where desired.

Primers may be selected from any portion of the polynucleotide to be detected. All
that is required is to review the sequence, such as those exemplary polynucleotides set forth 
herein or to any continuous portion of the sequence, from about 15-25 nucleotides in length 
up to and including the full length sequence, that one wishes to utilize as a primer. The 
choice of primer sequences may be governed by various factors. For example, one may
wish to employ primers from towards the termini of the total sequence. The exemplary
primers disclosed herein may optionally be used for their ability to selectively form duplex
molecules with complementary stretches of the entire polynucleotide of interest such as
those set forth in SEQ ID NOs: 1, 3, 5, 7, 21 and 26.

The present invention further provides the nucleotide sequence of various
exemplary oligonucleotide primers and probes that may be used, as described in further
detail herein, according to the methods of the present invention for the detection of cancer. 

Oligonucleotide primers according to the present invention may be readily prepared
by methods commonly available to the skilled artisan including, for example,
directly synthesizing the primers by chemical means, as is commonly practiced using an
automated oligonucleotide synthesizer. Depending on the application envisioned, one will
typically desire to employ varying conditions of hybridization to achieve varying degrees
of selectivity of probe towards target sequence. For applications requiring high selectivity,
one will typically desire to employ relatively stringent conditions to form the hybrids, e.g.,
one will select relatively low salt and/or high temperature conditions, such as provided by a
salt concentration of from about 0.02 M to about 0.15 M salt at temperatures of from about 
50°C to about 70°C. Such selective conditions tolerate little, if any, mismatch between the 
probe and the template or target strand, and would be particularly suitable for isolating 
related sequences. 

Polynucleotide Amplification Technique

Each of the specific embodiments outlined herein for the detection of lung cancer 
has in common the detection of a tissue- and/or tumor-specific polynucleotide via the 
hybridization of one or more oligonucleotide primers and/or probes. Depending on such 
factors as the relative number of cancer cells present in the biological sample and/or the
level of polynucleotide expression within each lung cancer cell, it may be preferred to
perform an amplification step prior to performing the steps of detection. For example, at
least two oligonucleotide primers may be employed in a polymerase chain reaction (PCR)
based assay to amplify a portion of a lung tumor cDNA derived from a biological sample,
wherein at least one of the oligonucleotide primers is specific for (i.e., hybridizes to) a
polynucleotide encoding the lung tumor protein. The amplified cDNA may optionally be 
subjected to a fractionation step such as, for example, gel electrophoresis. 

A number of template dependent processes are available to amplify the target 
sequences of interest present in a sample. One of the best known amplification methods is 
the polymerase chain reaction (PCR™), which is described in detail in U.S. Patent Nos.
4,683,195, 4,683,202 and 4,800,159. Briefly, in PCR™, two primer sequences are
prepared which are complementary to regions on opposite complementary strands of the
target sequence. An excess of deoxynucleoside triphosphates is added to a reaction
mixture along with a DNA polymerase (e.g., Taq polymerase). If the target sequence is
present in a sample, the primers will bind to the target and the polymerase will cause the
primers to be extended along the target sequence by adding on nucleotides. By raising and
lowering the temperature of the reaction mixture, the extended primers will dissociate from
the target to form reaction products, excess primers will bind to the target and to the
reaction product and the process is repeated. Preferably, a reverse transcription and PCR™
amplification procedure may be performed in order to quantify the amount of mRNA
amplified. Polymerase chain reaction methodologies are well known in the art. 

One preferred methodology for polynucleotide amplification employs RT-PCR, in 
which PCR is applied in conjunction with reverse transcription. Typically, RNA is 
extracted from a biological sample, such as blood, serum, lymph node, bone marrow, 
sputum, urine and tumor biopsy samples, and is reverse transcribed to produce cDNA
molecules. PCR amplification using at least one specific primer generates a cDNA 
molecule, which may be separated and visualized using, for example, gel electrophoresis. 
Amplification may be performed on biological samples taken from a patient and from an 
individual who is not afflicted with a cancer. The amplification reaction may be performed
on several dilutions of cDNA spanning two orders of magnitude. A two-fold or greater
increase in expression in several dilutions of the test patient sample as compared to the 
same dilutions of the non-cancerous sample is typically considered positive. 
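
The two-fold criterion above can be sketched in Python. The sketch assumes that "several dilutions" means at least two matching dilutions; that threshold, the function name, and the signal values are hypothetical choices made for illustration.

```python
def is_positive(patient, control, min_fold=2.0, min_dilutions=2):
    """Compare expression signals at matching cDNA dilutions.

    patient / control map a dilution label to a measured expression signal.
    A dilution counts as a hit when the patient signal is at least
    min_fold times the control signal. Dilutions with a zero or missing
    control signal are skipped rather than treated as infinite fold change.
    """
    shared = [d for d in patient if d in control]
    hits = [
        d for d in shared
        if control[d] > 0 and patient[d] / control[d] >= min_fold
    ]
    return len(hits) >= min_dilutions

# Invented signals for dilutions spanning two orders of magnitude.
control_sample = {"1:1": 2.0, "1:10": 0.3, "1:100": 0.05}
tumor_like = {"1:1": 8.0, "1:10": 0.9, "1:100": 0.08}    # 4x, 3x, 1.6x
normal_like = {"1:1": 2.2, "1:10": 0.3, "1:100": 0.05}   # about 1x throughout

tumor_call = is_positive(tumor_like, control_sample)
normal_call = is_positive(normal_like, control_sample)
```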

Any of a variety of commercially available kits may be used to perform the 
amplification step. One such amplification technique is inverse PCR (see Triglia et al.,
Nucl. Acids Res. 16:8186, 1988), which uses restriction enzymes to generate a fragment in
the known region of the gene. The fragment is then circularized by intramolecular ligation 
and used as a template for PCR with divergent primers derived from the known region. 
Within an alternative approach, sequences adjacent to a partial sequence may be retrieved 
by amplification with a primer to a linker sequence and a primer specific to a known 
region. The amplified sequences are typically subjected to a second round of amplification 
with the same linker primer and a second primer specific to the known region. A variation 
on this procedure, which employs two primers that initiate extension in opposite directions 
from the known sequence, is described in WIPO International Patent Application No.: WO 
96/38591. Another such technique is known as "rapid amplification of cDNA ends" or 
RACE. This technique involves the use of an internal primer and an external primer, 
which hybridizes to a polyA region or vector sequence, to identify sequences that are 5' and 
3' of a known sequence. Additional techniques include capture PCR (Lagerstrom et al.,
PCR Methods Applic. 1:111-19, 1991) and walking PCR (Parker et al., Nucl. Acids. Res.
19:3055-60, 1991). Other methods employing amplification may also be employed to
obtain a full length cDNA sequence. 

Another method for amplification is the ligase chain reaction (referred to as LCR), 
disclosed in Eur. Pat. Appl. Publ. No. 320,308. In LCR, two complementary probe pairs 
are prepared, and in the presence of the target sequence, each pair will bind to opposite 
complementary strands of the target such that they abut. In the presence of a ligase, the 
two probe pairs will link to form a single unit. By temperature cycling, as in PCR™, bound 
ligated units dissociate from the target and then serve as "target sequences" for ligation of 
excess probe pairs. U.S. Patent No. 4,883,750 describes an alternative method of
amplification similar to LCR for binding probe pairs to a target sequence.

Qbeta Replicase, described in PCT Intl. Pat. Appl. Publ. No. PCT/US87/00880,
may also be used as still another amplification method in the present invention. In this 
method, a replicative sequence of RNA that has a region complementary to that of a target 
is added to a sample in the presence of an RNA polymerase. The polymerase will copy the 
replicative sequence that can then be detected. 

An isothermal amplification method, in which restriction endonucleases and ligases 
are used to achieve the amplification of target molecules that contain nucleotide 
5'-[α-thio]triphosphates in one strand of a restriction site (Walker et al., 1992), may also be
useful in the amplification of nucleic acids in the present invention. 

Strand Displacement Amplification (SDA) is another method of carrying out 
isothermal amplification of nucleic acids which involves multiple rounds of strand
displacement and synthesis, i.e., nick translation. A similar method, called Repair Chain
Reaction (RCR), is another method of amplification which may be useful in the present
invention and involves annealing several probes throughout a region targeted for
amplification, followed by a repair reaction in which only two of the four bases are present. 
The other two bases can be added as biotinylated derivatives for easy detection. A similar 
approach is used in SDA. 

Sequences can also be detected using a cyclic probe reaction (CPR). In CPR, a 
probe having 3' and 5' sequences of non-target DNA and an internal or "middle"
sequence of the target protein specific RNA is hybridized to DNA which is present in a 
sample. Upon hybridization, the reaction is treated with RNaseH, and the products of the 
probe are identified as distinctive products by generating a signal that is released after 
digestion. The original template is annealed to another cycling probe and the reaction is 
repeated. Thus, CPR involves amplifying a signal generated by hybridization of a probe to 
a target gene specific expressed nucleic acid. 

Still other amplification methods, described in Great Britain Pat. Appl. No. 2 202
328 and in PCT Intl. Pat. Appl. Publ. No. PCT/US89/01025, may be used in accordance
with the present invention. In the former application, "modified" primers are used in a
PCR-like, template and enzyme dependent synthesis. The primers may be modified by
labeling with a capture moiety (e.g., biotin) and/or a detector moiety (e.g., enzyme). In the
latter application, an excess of labeled probes is added to a sample. In the presence of the
target sequence, the probe binds and is cleaved catalytically. After cleavage, the target
sequence is released intact to be bound by excess probe. Cleavage of the labeled probe
signals the presence of the target sequence.

Other nucleic acid amplification procedures include transcription-based 
amplification systems (TAS) (Kwoh et al., 1989; PCT Intl. Pat. Appl. Publ. No. WO
88/10315), including nucleic acid sequence based amplification (NASBA) and 3SR. In
NASBA, the nucleic acids can be prepared for amplification by standard
phenol/chloroform extraction, heat denaturation of a sample, treatment with lysis buffer
and minispin columns for isolation of DNA and RNA or guanidinium chloride extraction
of RNA. These amplification techniques involve annealing a primer that has sequences
specific to the target sequence. Following polymerization, DNA/RNA hybrids are digested
with RNase H while double stranded DNA molecules are heat-denatured again. In either
case the single stranded DNA is made fully double stranded by addition of a second
target-specific primer, followed by polymerization. The double stranded DNA molecules are then
multiply transcribed by a polymerase such as T7 or SP6. In an isothermal cyclic reaction,
the RNAs are reverse transcribed into DNA, and transcribed once again with a polymerase
such as T7 or SP6. The resulting products, whether truncated or complete, indicate
target-specific sequences.

Eur. Pat. Appl. Publ. No. 329,822 discloses a nucleic acid amplification process
involving cyclically synthesizing single-stranded RNA ("ssRNA"), ssDNA, and 
double-stranded DNA (dsDNA), which may be used in accordance with the present 
invention. The ssRNA is a first template for a first primer oligonucleotide, which is 
elongated by reverse transcriptase (RNA-dependent DNA polymerase). The RNA is then 
removed from resulting DNA:RNA duplex by the action of ribonuclease H (RNase H, an 
RNase specific for RNA in a duplex with either DNA or RNA). The resultant ssDNA is a 
second template for a second primer, which also includes the sequences of an RNA 
polymerase promoter (exemplified by T7 RNA polymerase) 5' to its homology to its
template. This primer is then extended by DNA polymerase (exemplified by the large
"Klenow" fragment of E. coli DNA polymerase I), resulting in a double-stranded DNA
("dsDNA") molecule, having a sequence identical to that of the original RNA between the
primers and having additionally, at one end, a promoter sequence. This promoter sequence
can be used by the appropriate RNA polymerase to make many RNA copies of the DNA.
These copies can then re-enter the cycle leading to very swift amplification. With proper 
choice of enzymes, this amplification can be done isothermally without addition of 
enzymes at each cycle. Because of the cyclical nature of this process, the starting sequence 
can be chosen to be in the form of either DNA or RNA. 

PCT Intl. Pat. Appl. Publ. No. WO 89/06700 discloses a nucleic acid sequence
amplification scheme based on the hybridization of a promoter/primer sequence to a target
single-stranded DNA ("ssDNA") followed by transcription of many RNA copies of the
sequence. This scheme is not cyclic; i.e., new templates are not produced from the resultant
RNA transcripts. Other amplification methods include "RACE" (Frohman, 1990) and
"one-sided PCR" (Ohara, 1989), which are well-known to those of skill in the art.

Compositions and Kits for the Detection of Cancer

The present invention further provides kits for use within any of the above 
diagnostic methods. Such kits typically comprise two or more components necessary for 
performing a diagnostic assay. Components may be compounds, reagents, containers
and/or equipment. For example, one container within a kit may contain a monoclonal
antibody or fragment thereof that specifically binds to a lung tumor protein. Such
antibodies or fragments may be provided attached to a support material, as described
above. One or more additional containers may enclose elements, such as reagents or
buffers, to be used in the assay. Such kits may also, or alternatively, contain a detection
reagent as described above that contains a reporter group suitable for direct or indirect
detection of antibody binding. 

The present invention also provides kits that are suitable for performing the 
detection methods of the present invention. Exemplary kits comprise oligonucleotide
primer pairs each one of which specifically hybridizes to a distinct polynucleotide. Within
certain embodiments, kits according to the present invention may also comprise a nucleic 
acid polymerase and suitable buffer. Exemplary oligonucleotide primers suitable for kits 
of the present invention are disclosed herein. Exemplary polynucleotides suitable for kits 
of the present invention are disclosed herein.

Alternatively, a kit may be designed to detect the level of mRNA encoding a lung 
tumor protein in a biological sample. Such kits generally comprise at least one 
oligonucleotide probe or primer, as described above, that hybridizes to a polynucleotide 
encoding a lung tumor protein. Such an oligonucleotide may be used, for example, within 
a PCR or hybridization assay. Additional components that may be present within such kits
include a second oligonucleotide and/or a diagnostic reagent or container to facilitate the 
detection of a polynucleotide encoding a lung tumor protein. 

In other related aspects, the present invention further provides compositions useful 
in the methods disclosed herein. Exemplary compositions comprise two or more 
oligonucleotide primer pairs each one of which specifically hybridizes to a distinct
polynucleotide. Exemplary oligonucleotide primers suitable for compositions of the 
present invention are disclosed herein. Exemplary polynucleotides suitable for 
compositions of the present invention are disclosed herein. 

The following Example is offered by way of illustration and not by way of 
limitation.






Docket No. 609P2 



EXAMPLES 
EXAMPLE 1 
Multiplex Detection of Lung Tumors 
A Multiplex Real-time PCR assay was established in order to simultaneously detect
the expression of four lung cancer-specific genes: L762 (SEQ ID NO:1), L984 (SEQ ID
NO:3), L550 (SEQ ID NO:5) and L552 (SEQ ID NO:7). In contrast to detection
approaches relying on expression analysis of single lung cancer-specific genes, this
Multiplex assay was able to detect all lung tumor samples tested and analyze their
combined mRNA expression profile in adenocarcinoma, squamous, small cell and large
cell lung tumors. L552S and L550S complement each other in detecting predominantly
adenocarcinomas, L762S detects squamous cell carcinomas and L984P detects small cell
carcinomas (see Table 1).

The primers and probes were designed to be intron spanning (exon specific) to 
eliminate any reactivity with genomic DNA making them suitable for use in blood samples 
without having to DNase-treat mRNA samples. They were also designed to produce
amplicons of different sizes to allow gel differentiation of end products if necessary.

The assay was carried out as follows: primers and Taqman probes specific for L552S
(SEQ ID NO: 7), L550 (SEQ ID NO: 5), L762 (SEQ ID NO: 1) and L984 (SEQ ID NO: 3)
were used to analyze their combined mRNA expression profile in lung
tumors. The primers and probes are shown below:

L552S: Forward Primer (SEQ ID NO:9): 5' GACGGCATGAGCGACACACA. Reverse
Primer (SEQ ID NO:10): 5' CCATGTCGCGCACTGGGATC. Probe (SEQ ID NO:11)
(FAM-5' - 3'-TAMRA): CTGAAAGTCGGGATCCTACACCTGGGCA.

L550P: Forward Primer (SEQ ID NO:12): 5' GGCCACCGTCTGGATTCTTC. Reverse
Primer (SEQ ID NO:13): 5' GAAGAATCCAGACGGTGGCC. Probe (SEQ ID NO:14)
(FAM-5' - 3'-TAMRA): CCGCCCCAAGATCAAATCCACAAACC.

L762S: Forward Primer (SEQ ID NO:15): 5' ATGGCAGAGGCTGACAGACTC.
Reverse Primer (SEQ ID NO:16): 5' TTCAACCACCTCAAATCCHTTCTTA. Probe
(SEQ ID NO:17) (FAM-5' - 3'-TAMRA): TCGACAGCAAAGGAGAGATCAGAGCCC.

L984P: Forward Primer (SEQ ID NO:18): 5' TTACGACCCGCTCAGCCC. Reverse
Primer (SEQ ID NO:19): 5' CTCCCAACGCCACTGACAA. Probe (SEQ ID NO:20)
(FAM-5' - 3'-TAMRA): CCAGGCCGAGCCCCTCAGAACC.

The assay conditions were: 
Taqman protocol (7700 Perkin Elmer): 

In 25 µl final volume: 1x Buffer A, 5 mM MgCl2, 0.2 mM dCTP, 0.2 mM dATP, 0.4 
mM dUTP, 0.2 mM dGTP, 0.01 U/µl AmpErase UNG, 0.0375 U/µl TaqGold, 8% (v/v) 
Glycerol, 0.05% (v/v) (Sigma), Gelatin, 0.05% (v/v) (Sigma), Tween20 0.1% v/v (Sigma), 
300mM of each forward and reverse primer for L762P, 50mM of each forward and reverse 
primer from (L552S, L984P, L550S, L984P) 2 pmol of each gene specific Taqman probe 
(L552S, L550S, L984P) and template cDNA. The PCR reaction was carried out at one 
cycle at 95°C for 10 minutes, followed by 50 cycles at 95°C for 15 seconds, 60°C for 1 
minute, and 68°C for 1 minute (ABI Prism 7900HT Sequence Detection System, Foster 
City, CA). 
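
The thermal profile above can be written down as data to estimate the programmed hold time of a run. The following is a minimal sketch under stated assumptions: it encodes only the temperatures, durations and cycle count given in the text, ignores ramp times between temperatures (which depend on the instrument), and uses the illustrative names Step and total_hold_seconds, which are not taken from any instrument software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One hold in the thermal profile."""
    temp_c: float
    seconds: int


# Values taken from the protocol text; ramp times are not modelled.
INITIAL_HOLD = Step(temp_c=95, seconds=10 * 60)
CYCLE_STEPS = (
    Step(temp_c=95, seconds=15),
    Step(temp_c=60, seconds=60),
    Step(temp_c=68, seconds=60),
)
N_CYCLES = 50


def total_hold_seconds(initial: Step, cycle: tuple, n_cycles: int) -> int:
    """Sum of all programmed holds: the initial hold plus n_cycles
    repetitions of the cycling steps."""
    return initial.seconds + n_cycles * sum(step.seconds for step in cycle)


if __name__ == "__main__":
    seconds = total_hold_seconds(INITIAL_HOLD, CYCLE_STEPS, N_CYCLES)
    print(f"Programmed hold time: {seconds} s ({seconds / 60:.1f} min)")
```

The actual run time on a given instrument would be longer by the accumulated ramp time.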

Since each primer set in the multiplex assay results in a band of unique length, 
expression signals of the four genes of interest were measured individually by agarose gel 
analysis. The combined expression signal of all four genes can also be measured in real 
time on an ABI 7700 Prism sequence detection system (Applied Biosystems, Foster City, 
CA). Although specific primers have been described herein, different primer sequences, 
different primer or probe labeling and different detection systems could be used to perform 
this multiplex assay. For example, a second fluorogenic reporter dye could be incorporated 
for parallel detection of a reference gene by real-time PCR. Alternatively, a SYBR 
Green detection system could be used instead of the Taqman probe approach. Table 2 
shows the reactivity of the multiplex PCR with different lung tumor types and normal lung 
tissue. 

TABLE 2 Expression of Lung Cancer Multiplex Genes (L762P, L552S, L550S, L984P) in 
Lung Tumor and Normal Lung 

Lung Tumor Type        Positive Samples/Samples Tested 
Adenocarcinoma         21/24 
Squamous               17/18 
Large Cell             5*/5 
Small Cell             5/6 
Normal Lung Tissue     0/12 
Total Tumors           48/53 
% Positive Tumors      90.57% 

Cut-off Value = Mean normal lung + 3 SD = 0.901 
* One sample at cut-off 
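
The cut-off rule and the summary row of Table 2 can be reproduced with a few lines of code. The following is a minimal sketch under stated assumptions: the normal-lung values passed to cutoff() are hypothetical placeholders (the individual measurements are not given in the text), the sample standard deviation is assumed (the text does not say which estimator was used), and a value exactly at the cut-off is treated as positive, following the footnote to the Large Cell row. The tumor counts are those of Table 2.

```python
from statistics import mean, stdev


def cutoff(normal_values, n_sd: float = 3.0) -> float:
    """Cut-off = mean of the normal samples + n_sd standard deviations.
    Sample SD is assumed here; the source does not specify the estimator."""
    return mean(normal_values) + n_sd * stdev(normal_values)


def is_positive(value: float, threshold: float) -> bool:
    """A sample at or above the cut-off is counted as positive (assumption
    based on the '* One sample at cut-off' footnote)."""
    return value >= threshold


def percent_positive(counts: dict) -> float:
    """Pooled percentage of positive samples over all tumor types."""
    positives = sum(p for p, _ in counts.values())
    tested = sum(t for _, t in counts.values())
    return 100.0 * positives / tested


# Positive samples / samples tested, from Table 2.
TABLE2_TUMORS = {
    "Adenocarcinoma": (21, 24),
    "Squamous": (17, 18),
    "Large Cell": (5, 5),
    "Small Cell": (5, 6),
}

if __name__ == "__main__":
    hypothetical_normals = [0.2, 0.4, 0.3, 0.5, 0.1]  # placeholder values
    threshold = cutoff(hypothetical_normals)
    print(f"Illustrative cut-off: {threshold:.3f}")
    print(f"Tumors positive: {percent_positive(TABLE2_TUMORS):.2f}%")
```

With the Table 2 counts the pooled figure is 48 of 53 tumors, which agrees with the 90.57% reported.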

EXAMPLE 2 
Multiplex Detection of Lung Tumors 

Six additional Multiplex Real-time PCR assays were established in order to 
simultaneously detect the expression of various combinations of recognized lung antigens: 
L762 (SEQ ID NO:1), L984 (SEQ ID NO:3), L550 (SEQ ID NO:5), L552 (SEQ ID NO:7), 
L763 (SEQ ID NO:21) and L587 (SEQ ID NO:26). The six groups consisted of: 
Group 1: L762, L552, L550 and L984 
Group 2: L763, L552, L550 and L984 
Group 3: L763, L552, L587 and L984 
Group 4: L763, L550, L587 and L984 
Group 5: L763, L550 and L587 
Group 6: L762, L984, L550 and L587 

The assays were carried out as described above in Example 1 to analyze the combined 
mRNA expression profile in lung tumors. The primers and probes for L552S, L550P, 
L762S and L984P are as described in Example 1. Primers and probes for L763 and L587 are 
described below: 

L763S: Forward Primer (SEQ ID NO:23): 5' ATTCCAGGCGACATCCTCACT. 
Reverse Primer (SEQ ID NO:24): 5' GTTTATCCCTGAGTCCTGTTTCCA. Probe (SEQ 
ID NO:25) (FAM-5' - 3'-TAMRA): TGTGCACCATTGGCTTCTAGGCACTCC. 

L587: Forward Primer (SEQ ID NO:28): 5' CCCAGAGCTGTGTTAAGGGATC. 
Reverse Primer (SEQ ID NO:29): 5' GTTAAGCGGGATTTCATGTACGA. Probe (SEQ 
ID NO:30) (FAM-5' - 3'-TAMRA): AGAACCTGAACCCGTAAAGAAGCCTCCC. 

The lung antigens that make up the six multiplex assays are able to detect all lung 
tumor samples tested and were analyzed for their combined mRNA expression profile in 
adenocarcinoma, squamous, small cell and large cell lung tumors. The results of these 
assays are presented in Table 3. 



TABLE 3 Expression of Lung Cancer Multiplex Genes in Lung Tumor and Normal Lung 

                      Positive Samples/Samples Tested 
Lung Tumor Type       Group 1   Group 2   Group 3   Group 4   Group 5   Group 6 
Adenocarcinoma        21/24     21/24     20/24     22/24     22/24     22/24 
Squamous              17/18     17/18     18/18     18/18     18/18     18/18 
Large Cell            5/5       3/5       4/5       3/5       3/5       4/5 
Small Cell            1/2       1/2       1/2       2/2       1/2       2/2 
Other                 2/2       2/2       2/2       2/2       2/2       2/2 
Normal Lung Tissue    0/12      0/12      0/12      0/13      0/13      0/13 
Total Tumors          46/51     44/51     45/51     47/51     46/51     48/51 
% Positive Tumors     90.20%    86.27%    88.24%    92.16%    90.20%    94.12% 
CO                    0.9       4.7       1.08      1.88      2.2       5.5 

Cut-off Value (CO) = Mean normal lung + 3 SD 
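
The per-group totals in Table 3 can be recomputed from the per-type counts and used to rank the six antigen combinations. The following is a minimal sketch: the counts are transcribed from Table 3, while the dictionary layout and the function names percent_positive and rank_groups are illustrative assumptions rather than part of the described method.

```python
# Positive samples / samples tested per tumor type, from Table 3.
TUMOR_TYPES = ("Adenocarcinoma", "Squamous", "Large Cell", "Small Cell", "Other")
TABLE3 = {
    "Group 1": ((21, 24), (17, 18), (5, 5), (1, 2), (2, 2)),
    "Group 2": ((21, 24), (17, 18), (3, 5), (1, 2), (2, 2)),
    "Group 3": ((20, 24), (18, 18), (4, 5), (1, 2), (2, 2)),
    "Group 4": ((22, 24), (18, 18), (3, 5), (2, 2), (2, 2)),
    "Group 5": ((22, 24), (18, 18), (3, 5), (1, 2), (2, 2)),
    "Group 6": ((22, 24), (18, 18), (4, 5), (2, 2), (2, 2)),
}


def totals(rows) -> tuple:
    """Pooled (positives, tested) over all tumor types of one group."""
    return sum(p for p, _ in rows), sum(t for _, t in rows)


def percent_positive(rows) -> float:
    """Pooled percentage of positive tumors for one group."""
    positives, tested = totals(rows)
    return 100.0 * positives / tested


def rank_groups(table: dict) -> list:
    """Group names ordered from highest to lowest pooled percentage."""
    return sorted(table, key=lambda g: percent_positive(table[g]), reverse=True)


if __name__ == "__main__":
    for group in rank_groups(TABLE3):
        pos, n = totals(TABLE3[group])
        print(f"{group}: {pos}/{n} = {percent_positive(TABLE3[group]):.2f}%")
```

Because each tumor type contributes a different number of samples, the pooled percentage is dominated by the adenocarcinoma and squamous rows; the small cell and "Other" rows contain only two samples each.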



Multiplex assays using groups 1, 4 and 6 were next used to detect circulating tumor 
cells in peripheral blood samples from 17 lung cancer patients undergoing various types of 
treatments. In addition, a single gene assay using lung antigen L523 (SEQ ID NO:31) was 
carried out in parallel using the primers as described in SEQ ID NOs:33 and 34. Six 
normal donors were included as controls. The assays were carried out as described above 
in Example 1. The cut-off value for detection in the assay was the mean of the normal 
lung samples + 3 standard deviations. 

Group 1 antigens were detected in 5/17 samples tested. Group 4 antigens were 
detected in 4/17 samples and Group 6 antigens were detected in 8/17 samples. L523 was 
detected as a single gene in 7/17 samples tested. The combination of antigens in Group 6 
was the most sensitive for lung tumor detection in tissue and blood of the groups tested. 

From the foregoing it will be appreciated that, although specific embodiments of 
the invention have been described herein for purposes of illustration, various modifications 
may be made without deviating from the spirit and scope of the invention. Accordingly, 
the invention is not limited except as by the appended claims. 

CLAIMS 

We Claim: 

1. A method for detecting the presence of a cancer cell in a patient, said 
method comprising the steps of: 

(a) obtaining a biological sample from said patient; 

(b) contacting the biological sample with two or more oligonucleotide 
pairs specific for independent polynucleotide sequences which are unrelated to one another, 
wherein the oligonucleotide pairs hybridize, under moderately stringent conditions, to their 
respective polynucleotides and the complements thereof; 

(c) amplifying said polynucleotides; and 

(d) detecting said amplified polynucleotides; 

wherein the presence of one or more of said amplified polynucleotides 
indicates the presence of lung cancer cells in said patient. 

2. A method for determining the presence of lung cancer cells in a 
patient, said method comprising the steps of: 

(a) obtaining a biological sample from said patient; 

(b) contacting a biological sample obtained from the patient with two 
or more oligonucleotides that hybridize to two or more polynucleotides that encode two or 
more lung tumor proteins; 

(c) detecting in said biological sample an amount of a polynucleotide 
that hybridizes to at least one of said oligonucleotides; and 

(d) comparing the amount of the polynucleotides that hybridizes to said 
oligonucleotides to a predetermined cut-off value, and therefrom determining the presence 
or absence of lung cancer cells in the patient. 

3. A method for monitoring the progression of lung cancer in a patient, 
said method comprising the steps of: 

(a) obtaining a first biological sample from said patient; 

(b) contacting said first biological sample with one or more 
oligonucleotides that hybridize to one or more polynucleotides that encode lung tumor 
proteins; 

(c) detecting in said first biological sample an amount of at least one of 
said polynucleotides that hybridize to said oligonucleotides; 

(d) repeating steps (b) and (c) using a second biological sample obtained 
from said patient at a subsequent point in time; and 

(e) comparing the amount of polynucleotides detected in step (d) with 
the amount detected in step (c) and therefrom monitoring the progression of lung cancer in 
said patient. 

METHODS, COMPOSITIONS AND KITS 
FOR THE DETECTION AND MONITORING OF LUNG CANCER 

ABSTRACT OF THE DISCLOSURE 

Compositions and methods for the diagnosis of lung cancer are disclosed. Such 
methods are useful to detect early tumors or provide adequate stage/grade information or 
tumor specificity. Compositions may comprise one or more lung tumor proteins, 
immunogenic portions thereof, or polynucleotides that encode such portions. Such 
compositions may be used, for example, to improve lung cancer diagnosis and prognosis 
and potentially differentiate between NSCLC and SCLC. 

EXPRESS MAIL NO: EV324206033US 



SEQUENCE LISTING 

<110> Zehentner-Wilkinson, Barbara K. 
Hayes, Dawn 
Houghton, Raymond L. 

<120> METHODS, COMPOSITIONS AND KITS FOR THE DETECTION 
AND MONITORING OF LUNG CANCER 

<130> 609P2 

<140> US 

<141> 2003-09-15 

<160> 34 

<170> Corixa Invention Disclosure Database 

<210> 1 

<211> 3951 

<212> DNA 

<213> Homo sapiens 

<400> 1 

tctgcatcca tattgaaaac ctgacacaat gtatgcagca ggctcagtgt gagtgaactg 60 
gaggcttctc tacaacatga cccaaaggag cattgcaggt cctatttgca acctgaagtt 120 
tgtgactctc ctggttgcct taagttcaga actcccattc ctgggagctg gagtacagct 180 
tcaagacaat gggtataatg gattgctcat tgcaattaat cctcaggtac ctgagaatca 240 
gaacctcatc tcaaacatta aggaaatgat aactgaagct tcattttacc tatttaatgc 300 
taccaagaga agagtatttt tcagaaatat aaagatttta atacctgcca catggaaagc 360 
taataataac agcaaaataa aacaagaatc atatgaaaag gcaaatgtca tagtgactga 420 
ctggtatggg gcacatggag atgatccata caccctacaa tacagagggt gtggaaaaga 480 
gggaaaatac attcatttca cacctaattt cctactgaat gataacttaa cagctggcta 540 
cggatcacga ggccgagtgt ttgtccatga atgggcccac ctccgttggg gtgtgttcga 600 
tgagtataac aatgacaaac ctttctacat aaatgggcaa aatcaaatta aagtgacaag 660 
gtgttcatct gacatcacag gcatttttgt gtgtgaaaaa ggtccttgcc cccaagaaaa 720 
ctgtattatt agtaagcttt ttaaagaagg atgcaccttt atctacaata gcacccaaaa 780 
tgcaactgca tcaataatgt tcatgcaaag tttatcttct gtggttgaat tttgtaatgc 840 
aagtacccac aaccaagaag caccaaacct acagaaccag atgtgcagcc tcagaagtgc 900 
atgggatgta atcacagact ctgctgactt tcaccacagc tttcccatga acgggactga 960 
gcttccacct cctcccacat tctcgcttgt agaggctggt gacaaagtgg tctgtttagt 1020 
gctggatgtg tccagcaaga tggcagaggc tgacagactc cttcaactac aacaagccgc 1080 
agaattttat ttgatgcaga ttgttgaaat tcataccttc gtgggcattg ccagtttcga 1140 
cagcaaagga gagatcagag cccagctaca ccaaattaac agcaatgatg atcgaaagtt 1200 
gctggtttca tatctgccca ccactgtatc agctaaaaca gacatcagca tttgttcagg 1260 
gcttaagaaa ggatttgagg tggttgaaaa actgaatgga aaagcttatg gctctgtgat 1320 
gatattagtg accagcggag atgataagct tcttggcaat tgcttaccca ctgtgctcag 1380 
cagtggttca acaattcact ccattgccct gggttcatct gcagccccaa atctggagga 1440 
attatcacgt cttacaggag gtttaaagtt ctttgttcca gatatatcaa actccaatag 1500 
catgattgat gctttcagta gaatttcctc tggaactgga gacattttcc agcaacatat 1560 
tcagcttgaa agtacaggtg aaaatgtcaa acctcaccat caattgaaaa acacagtgac 1620 
tgtggataat actgtgggca acgacactat gtttctagtt acgtggcagg ccagtggtcc 1680 
tcctgagatt atattatttg atcctgatgg acgaaaatac tacacaaata attttatcac 1740 
caatctaact tttcggacag ctagtctttg gattccagga acagctaagc ctgggcactg 1800 
gacttacacc ctgaacaata cccatcattc tctgcaagcc ctgaaagtga cagtgacctc 1860 
tcgcgcctcc aactcagctg tgcccccagc cactgtggaa gcctttgtgg aaagagacag 1920 
cctccatttt cctcatcctg tgatgattta tgccaatgtg aaacagggat tttatcccat 1980 
tcttaatgcc actgtcactg ccacagttga gccagagact ggagatcctg ttacgctgag 2040 
ff^" t ? at ^{JSragcag gtgctgatgt tataaaaaat gatggaattt actcgaggta 2100 
ttttttctcc tttgctgcaa atggtagata tagcttgaaa gtgcatgtca atcactctcc 2160 
cagcataagc accccagccc actctattcc agggagtcat gctatgtatg taccaggtta 2220 
cacagcaaac ggtaatattc agatgaatgc tccaaggaaa tcagtaggca gaaatgagga 2280 
ggagcgaaag tggggcttta gccgagtcag ctcaggaggc tccttttcag tgctgggagt 2340 
tccagctggc ccccaccctg atgtgtttcc accatgcaaa attattgacc tggaagctgt 2400 
aaaagtagaa gaggaattga ccctatcttg gacagcacct ggagaagact ttgatcaggg 2460 
ccaggctaca agctatgaaa taagaatgag taaaagtcta cagaatatcc aagatgactt 2520 
taacaatgct attttagtaa atacatcaaa gcgaaatcct cagcaagctg gcatcaggga 2580 
gatatttacg ttctcacccc aaatttccac gaatggacct gaacabcagc caaatggaga 2640 
aacacatgaa agccacagaa tttatgttgc aatacgagca atggatagga actccttaca 2700 
gtctgctgta tctaacattg cccaggcgcc tctgtttatt ccccccaatt ctgatcctgt 2760 
^S^HSHt ga " atc "a tattgaaagg agttttaaca gcaatgggtt tgataggaat 2820 
catttgcctt attatagttg tgacacatca tactttaagc aggaaaaaga gagcagacaa 2880 
gaaagagaat ggaacaaaat Cattataaat aaatatccaa agtgtcttcc ttcttagata 2940 
taagacccat ggccttcgac tacaaaaaca tactaacaaa gtcaaattaa catcaaaact 3000 
f= a ^ aa ff! scattgagtt tttgtacaat acagataaga tttttacatg gtagatcaac 3060 
aaattctttt tgggggtaga ttagaaaacc cttacacttt ggctatgaac aaataataaa 3120 
K^iiSi ^ aa ^ a " g tctttaaa 9S caaagggaag ggtaaagtcg gaccagtgtc 3180 
aaggaaagtt tgttttattg aggtggaaaa atagccccaa gcagagaaaa ggagggtagg 3240 
tctgcattat aactgtctgt gtgaagcaat catttagtta ctttgattaa tttttctttt 3300 
ctccttatct gtgcagaaca ggttgcttgt ttacaactga agatcatgct atatttcata 3360 
tatgaagccc ctaatgcaaa gctctttacc tcttgctatt ttgttatata tattacagat 3420 
gaaatctcac tgctaatgct cagagatctt ttttcactgt aagaggtaac ctttaacaat 3480 
atgggtatta cctttgtctc ttcataccgg ttttatgaca aaggtctatt gaatttattt 3540 
gtttgtaagt ttctactccc atcaaagcag ctttttaagt tattgccttg gttattatgg 3600 
atgatagtta tagcccttat aatgccttaa ctaaggaaga aaagatgtta Ctctgagttt 3660 
gttttaatac atatatgaac atatagtttt attcaattaa accaaagaag aggtcagcag 3720 
ggagatacta acctttggaa atgattagct ggctctgttt tttggttaaa taagagtctt 3780 
taatcctttc tccatcaaga gttacttacc aagggcaggg gaagggggat atagaggtcc 3840 
caaggaaata aaaatcatct ttcatcttta attttactcc ttcctcttat ttttttaaaa 3900 
gattatcgaa caataaaatc atttgccttt ttaattaaaa acataaaaaa a 3951 

<210> 2 
<211> 943 
<212> PRT 

<213> Homo sapiens 
<400> 2 

Met Thr Gln Arg Ser Ile Ala Gly Pro Ile Cys Asn Leu Lys Phe Val 
1 5 10 15 

Thr Leu Leu Val Ala Leu Ser Ser Glu Leu Pro Phe Leu Gly Ala Gly 
20 25 30 

Val Gln Leu Gln Asp Asn Gly Tyr Asn Gly Leu Leu Ile Ala Ile Asn 
35 40 45 

Pro Gln Val Pro Glu Asn Gln Asn Leu Ile Ser Asn Ile Lys Glu Met 
50 55 60 

Ile Thr Glu Ala Ser Phe Tyr Leu Phe Asn Ala Thr Lys Arg Arg Val 
65 70 75 80 

Phe Phe Arg Asn Ile Lys Ile Leu Ile Pro Ala Thr Trp Lys Ala Asn 
85 90 95 

Asn Asn Ser Lys Ile Lys Gln Glu Ser Tyr Glu Lys Ala Asn Val Ile 
100 105 110 

Val Thr Asp Trp Tyr Gly Ala His Gly Asp Asp Pro Tyr Thr Leu Gln 
115 120 125 

Tyr Arg Gly Cys Gly Lys Glu Gly Lys Tyr Ile His Phe Thr Pro Asn 
130 135 140 

Phe Leu Leu Asn Asp Asn Leu Thr Ala Gly Tyr Gly Ser Arg Gly Arg 
145 150 155 160 

Val Phe Val His Glu Trp Ala His Leu Arg Trp Gly Val Phe Asp Glu 
165 170 175 

Tyr Asn Asn Asp Lys Pro Phe Tyr Ile Asn Gly Gln Asn Gln Ile Lys 
180 185 190 

Val Thr Arg Cys Ser Ser Asp Ile Thr Gly Ile Phe Val Cys Glu Lys 
195 200 205 

Gly Pro Cys Pro Gln Glu Asn Cys Ile Ile Ser Lys Leu Phe Lys Glu 
210 215 220 

Gly Cys Thr Phe Ile Tyr Asn Ser Thr Gln Asn Ala Thr Ala Ser Ile 
225 230 235 240 

Met Phe Met Gln Ser Leu Ser Ser Val Val Glu Phe Cys Asn Ala Ser 
245 250 255 

Thr His Asn Gln Glu Ala Pro Asn Leu Gln Asn Gln Met Cys Ser Leu 
260 265 270 

Arg Ser Ala Trp Asp Val Ile Thr Asp Ser Ala Asp Phe His His Ser 
275 280 285 

Phe Pro Met Asn Gly Thr Glu Leu Pro Pro Pro Pro Thr Phe Ser Leu 
290 295 300 

Val Glu Ala Gly Asp Lys Val Val Cys Leu Val Leu Asp Val Ser Ser 
305 310 315 320 

Lys Met Ala Glu Ala Asp Arg Leu Leu Gln Leu Gln Gln Ala Ala Glu 
325 330 335 

Phe Tyr Leu Met Gln Ile Val Glu Ile His Thr Phe Val Gly Ile Ala 
340 345 350 

Ser Phe Asp Ser Lys Gly Glu Ile Arg Ala Gln Leu His Gln Ile Asn 
355 360 365 

Ser Asn Asp Asp Arg Lys Leu Leu Val Ser Tyr Leu Pro Thr Thr Val 
370 375 380 

Ser Ala Lys Thr Asp Ile Ser Ile Cys Ser Gly Leu Lys Lys Gly Phe 
385 390 395 400 

Glu Val Val Glu Lys Leu Asn Gly Lys Ala Tyr Gly Ser Val Met Ile 
405 410 415 

Leu Val Thr Ser Gly Asp Asp Lys Leu Leu Gly Asn Cys Leu Pro Thr 
420 425 430 

Val Leu Ser Ser Gly Ser Thr Ile His Ser Ile Ala Leu Gly Ser Ser 
435 440 445 

Ala Ala Pro Asn Leu Glu Glu Leu Ser Arg Leu Thr Gly Gly Leu Lys 
450 455 460 

Phe Phe Val Pro Asp Ile Ser Asn Ser Asn Ser Met Ile Asp Ala Phe 
465 470 475 480 

Ser Arg Ile Ser Ser Gly Thr Gly Asp Ile Phe Gln Gln His Ile Gln 
485 490 495 

Leu Glu Ser Thr Gly Glu Asn Val Lys Pro His His Gln Leu Lys Asn 
500 505 510 

Thr Val Thr Val Asp Asn Thr Val Gly Asn Asp Thr Met Phe Leu Val 
515 520 525 

Thr Trp Gln Ala Ser Gly Pro Pro Glu Ile Ile Leu Phe Asp Pro Asp 
530 535 540 

Gly Arg Lys Tyr Tyr Thr Asn Asn Phe Ile Thr Asn Leu Thr Phe Arg 
545 550 555 560 

Thr Ala Ser Leu Trp Ile Pro Gly Thr Ala Lys Pro Gly His Trp Thr 
565 570 575 

Tyr Thr Leu Asn Asn Thr His His Ser Leu Gln Ala Leu Lys Val Thr 
580 585 590 

Val Thr Ser Arg Ala Ser Asn Ser Ala Val Pro Pro Ala Thr Val Glu 
595 600 605 

Ala Phe Val Glu Arg Asp Ser Leu His Phe Pro His Pro Val Met Ile 
610 615 620 

Tyr Ala Asn Val Lys Gln Gly Phe Tyr Pro Ile Leu Asn Ala Thr Val 
625 630 635 640 

Thr Ala Thr Val Glu Pro Glu Thr Gly Asp Pro Val Thr Leu Arg Leu 
645 650 655 

Leu Asp Asp Gly Ala Gly Ala Asp Val Ile Lys Asn Asp Gly Ile Tyr 
660 665 670 

Ser Arg Tyr Phe Phe Ser Phe Ala Ala Asn Gly Arg Tyr Ser Leu Lys 
675 680 685 

Val His Val Asn His Ser Pro Ser Ile Ser Thr Pro Ala His Ser Ile 
690 695 700 

Pro Gly Ser His Ala Met Tyr Val Pro Gly Tyr Thr Ala Asn Gly Asn 
705 710 715 720 

Ile Gln Met Asn Ala Pro Arg Lys Ser Val Gly Arg Asn Glu Glu Glu 
725 730 735 

Arg Lys Trp Gly Phe Ser Arg Val Ser Ser Gly Gly Ser Phe Ser Val 
740 745 750 

Leu Gly Val Pro Ala Gly Pro His Pro Asp Val Phe Pro Pro Cys Lys 
755 760 765 

Ile Ile Asp Leu Glu Ala Val Lys Val Glu Glu Glu Leu Thr Leu Ser 
770 775 780 

Trp Thr Ala Pro Gly Glu Asp Phe Asp Gln Gly Gln Ala Thr Ser Tyr 
785 790 795 800 

Glu Ile Arg Met Ser Lys Ser Leu Gln Asn Ile Gln Asp Asp Phe Asn 
805 810 815 

Asn Ala Ile Leu Val Asn Thr Ser Lys Arg Asn Pro Gln Gln Ala Gly 
820 825 830 

Ile Arg Glu Ile Phe Thr Phe Ser Pro Gln Ile Ser Thr Asn Gly Pro 
835 840 845 

Glu His Gln Pro Asn Gly Glu Thr His Glu Ser His Arg Ile Tyr Val 
850 855 860 

Ala Ile Arg Ala Met Asp Arg Asn Ser Leu Gln Ser Ala Val Ser Asn 
865 870 875 880 

Ile Ala Gln Ala Pro Leu Phe Ile Pro Pro Asn Ser Asp Pro Val Pro 
885 890 895 

Ala Arg Asp Tyr Leu Ile Leu Lys Gly Val Leu Thr Ala Met Gly Leu 
900 905 910 

Ile Gly Ile Ile Cys Leu Ile Ile Val Val Thr His His Thr Leu Ser 
915 920 925 

Arg Lys Lys Arg Ala Asp Lys Lys Glu Asn Gly Thr Lys Leu Leu 
930 935 940 

<210> 3 

<211> 785 

<212> DNA 

<213> Homo sapiens 

<400> 3 

tctgattccg cgactccttg gccgccgctg cgcatggaaa gctctgccaa gatggagagc 60 
ggcggcgccg gccagcagcc ccagccgcag ccccagcagc ccttcctgcc gcccgcagcc 120 
tgtttctttg ccacggccgc agccgcggcg gccgcagccg ccgcagcggc agcgcagagc 180 
gcgcagcagc agcagcagca gcagcagcag caggcgccgc agctgagacc ggcggccgac 240 
ggccagccct cagggggcgg tcacaagtca gcgcccaagc aagtcaagcg acagcgctcg 300 
tcttcgcccg aactgatgcg ctgcaaacgc cggctcaact tcagcggctt tggctacagc 360 
ctgccgcagc agcagccggc cgccgtggcg cgccgcaacg agcgcgagcg caaccgcgtc 420 
aagttggtca acctgggctt tgccaccctt cgggagcacg tccccaacgg cgcggccaac 480 
aagaagatga gtaaggtgga gacactgcgc tcggcggtcg agtacatccg cgcgctgcag 540 
cagctgctgg acgagcatga cgcggtgagc gccgccttcc aggcaggcgt cctgtcgccc 600 
accatctccc ccaactactc caacgacttg aactccatgg ccggctcgcc ggtctcatcc 660 
tactcgtcgg acgagggctc ttacgacccg ctcagccccg aggagcagga gcttctcgac 720 
ttcaccaact ggttctgagg ggctcggcct ggtcaggccc tggtgcgaat ggactttgga 780 
agcag 785 

<210> 4 
<211> 236 
<212> PRT 

<213> Homo sapiens 

<400> 4 

Met Glu Ser Ser Ala Lys Met Glu Ser Gly Gly Ala Gly Gln Gln Pro 
1 5 10 15 

Gln Pro Gln Pro Gln Gln Pro Phe Leu Pro Pro Ala Ala Cys Phe Phe 
20 25 30 

Ala Thr Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Ala Gln 
35 40 45 

Ser Ala Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Gln Ala Pro 
50 55 60 

Gln Leu Arg Pro Ala Ala Asp Gly Gln Pro Ser Gly Gly Gly His Lys 
65 70 75 80 

Ser Ala Pro Lys Gln Val Lys Arg Gln Arg Ser Ser Ser Pro Glu Leu 
85 90 95 

Met Arg Cys Lys Arg Arg Leu Asn Phe Ser Gly Phe Gly Tyr Ser Leu 
100 105 110 

Pro Gln Gln Gln Pro Ala Ala Val Ala Arg Arg Asn Glu Arg Glu Arg 
115 120 125 

Asn Arg Val Lys Leu Val Asn Leu Gly Phe Ala Thr Leu Arg Glu His 
130 135 140 

Val Pro Asn Gly Ala Ala Asn Lys Lys Met Ser Lys Val Glu Thr Leu 
145 150 155 160 

Arg Ser Ala Val Glu Tyr Ile Arg Ala Leu Gln Gln Leu Leu Asp Glu 
165 170 175 

His Asp Ala Val Ser Ala Ala Phe Gln Ala Gly Val Leu Ser Pro Thr 
180 185 190 

Ile Ser Pro Asn Tyr Ser Asn Asp Leu Asn Ser Met Ala Gly Ser Pro 
195 200 205 

Val Ser Ser Tyr Ser Ser Asp Glu Gly Ser Tyr Asp Pro Leu Ser Pro 
210 215 220 

Glu Glu Gln Glu Leu Leu Asp Phe Thr Asn Trp Phe 
225 230 235 



<210> 5 

<211> 1633 

<212> DNA 

<213> Homo sapiens 

<400> 5 

cgtggaggca gctagcgcga ggctggggag cgctgagccg cgcgtcgtgc cctgcgctgc 60 
ccagactagc gaacaataca gtcgggatgg ctaaaggtga ccccaagaaa ccaaagggca 120 
agacgtccgc ttatgccttc tttgtgcaga catgcagaga agaacataag aagaaaaacc 180 
cagaggtccc tgtcaatttt gcggaatttt ccaagaagtg ctctgagagg tggaagacgg 240 
tgtccgggaa agagaaatcc aaatttgatg aaatggcaaa ggcagataaa gtgcgctatg 300 
atcgggaaat gaaggattat ggaccagcta agggaggcaa gaagaagaag gatcctaatg 360 
ctcccaaaag gccaccgtct ggattcttcc tgttctgttc agaattccgc cccaagatca 420 
aatccacaaa ccccggcatc tctattggag acgtggcaaa aaagctgggt gagatgtgga 480 
ataatttaaa tgacagtgaa aagcagcctt acatcactaa ggcggcaaag ctgaaggaga 540 
agtatgagaa ggatgttgct gactataagt cgaaaggaaa gtttgatggt gcaaagggtc 600 
ctgctaaagt tgcccggaaa aaggtggaag aggaagatga agaacaggag gaggaagaag 660 
aggaggagga ggaggaggag gatgaataaa gaaactgttt atctgtctcc ttgtgaatac 720 
ttagagtagg ggagcgccgt aattgacaca tctcttattt gagaagtgtc tgttgccctc 780 
attaggttta attacaaaat ttgatcacga tcatattgta gtctctcaaa gtgctctaga 840 
aattgtcagt ggtttacatg aagtggccat gggtgtctgg agcaccctga aactgtatca 900 
aagttgtaca tatttccaaa catttttaaa atgaaaaggc actctcgtgt tctcctcact 960 
ctgtgcactt tgctgttggt gtgacaaggc atttaaagat gtttctggca ttttcttttt 1020 
atttgtaagg tggtggtaac tatggttatt ggctagaaat cctgagtttt caactgtata 1080 
tatctatagt ttgtaaaaag aacaaaacaa ccgagacaaa cccttgatgc tccttgctcg 1140 
gcgttgaggc tgtggggaag atgccttttg ggagaggctg tagctcaggg cgtgcactgt 1200 
gaggctggac ctgttgactc tgcagggggc atccatttag cttcaggttg tcttgtttct 1260 
gtatatagtg acatagcatt ctgctgccat cttagctgtg gacaaagggg ggtcagctgg 1320 
catgagaata ttttttttta agtgcggtag tttttaaact gtttgttttt aaacaaacta 1380 
tagaactctt cattgtcagc aaagcaaaga gtcactgcat caatgaaagt tcaagaacct 1440 
cctgtactta aacacgattc gcaacgttct gttatttttt ttgtatgttt agaatgctga 1500 
aatgtttttg aagttaaata aacagtatta catttttaga actcttctct actataacag 1560 
tcaatttctg actcacagca gtgaacaaac ccccactccg ttgtatttgg agactggcct 1620 
ccctataaat gtg 1633 

<210> 6 

<211> 200 

<212> PRT 

<213> Homo sapiens 

<400> 6 

Met Ala Lys Gly Asp Pro Lys Lys Pro Lys Gly Lys Met Ser Ala Tyr 
1 5 10 15 

Ala Phe Phe Val Gln Thr Cys Arg Glu Glu His Lys Lys Lys Asn Pro 
20 25 30 

Glu Val Pro Val Asn Phe Ala Glu Phe Ser Lys Lys Cys Ser Glu Arg 
35 40 45 

Trp Lys Thr Met Ser Gly Lys Glu Lys Ser Lys Phe Asp Glu Met Ala 
50 55 60 

Lys Ala Asp Lys Val Arg Tyr Asp Arg Glu Met Lys Asp Tyr Gly Pro 
65 70 75 80 

Ala Lys Gly Gly Lys Lys Lys Lys Asp Pro Asn Ala Pro Lys Arg Pro 
85 90 95 

Pro Ser Gly Phe Phe Leu Phe Cys Ser Glu Phe Arg Pro Lys Ile Lys 
100 105 110 

Ser Thr Asn Pro Gly Ile Ser Ile Gly Asp Val Ala Lys Lys Leu Gly 
115 120 125 

Glu Met Trp Asn Asn Leu Asn Asp Ser Glu Lys Gln Pro Tyr Ile Thr 
130 135 140 

Lys Ala Ala Lys Leu Lys Glu Lys Tyr Glu Lys Asp Val Ala Asp Tyr 
145 150 155 160 

Lys Ser Lys Gly Lys Phe Asp Gly Ala Lys Gly Pro Ala Lys Val Ala 
165 170 175 

Arg Lys Lys Val Glu Glu Glu Asp Glu Glu Glu Glu Glu Glu Glu Glu 
180 185 190 

Glu Glu Glu Glu Glu Glu Asp Glu 
195 200 



<210> 7 

<211> 781 

<212> DNA 

<213> Homo sapiens 

<400> 7 

gcggcggagc tgtgagccgg cgactcgggt ccctgaggtc tggattcttt ctccgctact 60 
gagacacggc gggtaggtcc acaggcagat ccaactggga gttgaagtgt gagtgagagt 120 
gaagaggaac cagcaggctt ccggagggtt gtgtggtcag tgactcagag tgagaaggcc 180 
ctcgaagtcg tcgtccctct catgcggtgc cacgcccatg gaccttcttg tctcgtcacg 240 
gccataacta gggaggaagg agggccgagg agtggagggg ctcaggcgaa gctggggtgc 300 
tgttgggggt atccgagtcc cagaagcacc tggaaccccg acagaagatt ctggactccc 360 
cagacgggac caggagaggg acggcatgag cgacacacac aaacacagaa ccacacagcc 420 
agtcccagga gcccagtaat ggagagcccc aaaaagaaga accagcagct gaaagtcggg 480 
atcctacacc tgggcagcag acagaagaag atcaggatac agctgagatc ccagtgcgcg 540 
acatggaagg tgatctgcaa gagctgcatc agtcaaacac cggggataaa tctggatttg 600 
ggttccggcg tcaaggtgaa gataatacct aaagaggaac actgtaaaat gccagaagca 660 
ggtgaagagc aaccacaagt ttaaatgaag acaagctgaa acaacgcaag ctggttttat 720 
attagatatt tgacttaaac tatctcaata aagttttgca gctttcacca aaaaaaaaaa 780 
a 781 

<210> 8 

<211> 160 

<212> PRT 

<213> Homo sapiens 

<400> 8 

Met Arg Cys His Ala His Gly Pro Ser Cys Leu Val Thr Ala Ile Thr 
1 5 10 15 

Arg Glu Glu Gly Gly Pro Arg Ser Gly Gly Ala Gln Ala Lys Leu Gly 
20 25 30 

Cys Cys Trp Gly Tyr Pro Ser Pro Arg Ser Thr Trp Asn Pro Asp Arg 
35 40 45 

Arg Phe Trp Thr Pro Gln Thr Gly Pro Gly Glu Gly Arg His Glu Arg 
50 55 60 

His Thr Gln Thr Gln Asn His Thr Ala Ser Pro Arg Ser Pro Val Met 
65 70 75 80 

Glu Ser Pro Lys Lys Lys Asn Gln Gln Leu Lys Val Gly Ile Leu His 
85 90 95 

Leu Gly Ser Arg Gln Lys Lys Ile Arg Ile Gln Leu Arg Ser Gln Cys 
100 105 110 

Ala Thr Trp Lys Val Ile Cys Lys Ser Cys Ile Ser Gln Thr Pro Gly 
115 120 125 

Ile Asn Leu Asp Leu Gly Ser Gly Val Lys Val Lys Ile Ile Pro Lys 
130 135 140 

Glu Glu His Cys Lys Met Pro Glu Ala Gly Glu Glu Gln Pro Gln Val 
145 150 155 160 



<210> 9 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 9 

gacggcatga gcgacacaca 20 

<210> 10 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 10 

ccatgtcgcg cactgggatc 20 

<210> 11 

<211> 28 

<212> DNA 

<213> Homo sapiens 

<400> 11 

ctgaaagtcg ggatcctaca cctgggca 28 

<210> 12 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 12 

ggccaccgtc tggattcttc 20 

<210> 13 

<211> 20 

<212> DNA 

<213> Homo sapiens 

<400> 13 

gaagaatcca gacggtggcc 20 



<210> 14 

<211> 26 

<212> DNA 

<213> Homo sapiens 

<400> 14 

ccgccccaag atcaaatcca caaacc 26 

<210> 15 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 15 

atggcagagg ctgacagact c 21 

<210> 16 

<211> 25 

<212> DNA 

<213> Homo sapiens 

<400> 16 

ttcaaccacc tcaaatcctt tctta 25 

<210> 17 

<211> 27 

<212> DNA 

<213> Homo sapiens 

<400> 17 

tcgacagcaa aggagagatc agagccc 27 

<210> 18 

<211> 18 

<212> DNA 

<213> Homo sapiens 

<400> 18 

ttacgacccg ctcagccc 18 

<210> 19 

<211> 19 

<212> DNA 

<213> Homo sapiens 

<400> 19 

ctcccaacgc cactgacaa 19 

<210> 20 

<211> 22 

<212> DNA 

<213> Homo sapiens 

<400> 20 

ccaggccgag cccctcagaa cc 22 

<210> 21 

<211> 1800 

<212> DNA 

<213> Homo sapiens 

<400> 21 

gcgcctcatt gccactgcag tgactaaagc tgggaagacg ctggtcagtt cacctgcccc 60 
actggttgtt ttttaaacaa attctgatac aggcgacatc ctcactgacc gagcaaagat 120 
tgacattcgt atcatcactg tgcaccattg gcttctaggc actccagtgg ggtaggagaa 180 
ggaggtctga aaccctcgca gagggatctt gccctcattc tttgggtctg aaacactggc 240 
agtcgttgga aacaggactc agggataaac cagcgcaatg gattggggga cgctgcacac 300 
tttcatcggg ggtgtcaaca aacactccac cagcatcggg aaggtgtgga tcacagtcat 360 
ctttattttc cgagtcatga tcctagtggt ggctgcccag gaagtgtggg gtgacgagca 420 
agaggacttc gtctgcaaca cactgcaacc gggatgcaaa aatgtgtgct atgaccactt 480 
tttcccggtg tcccacatcc ggctgtgggc cctccagctg atcttcgtct ccaccccagc 540 
gctgctggtg gccatgcatg tggcctacta caggcacgaa accactcgca agttcaggcg 600 
aggagagaag aggaatgatt tcaaagacat agaggacatt aaaaagcaca aggttcggat 660 
agaggggtcg ctgtggtgga cgtacaccag cagcatcttt ttccgaatca tctttgaagc 720 
agcctttatg tatgtgtttt acttccttta caatgggtac cacctgccct gggtgttgaa 780 
atgtgggatt gacccctgcc ccaaccttgt tgactgcttt atttctaggc caacagagaa 840 
gaccgtgttt accattttta tgatttctgc gtctgtgatt tgcatgctgc ttaacgtggc 900 
agagttgtgc tacctgctgc tgaaagtgtg ttttaggaga tcaaagagag cacagacgca 960 
aaaaaatcac cccaatcatg ccctaaagga gagtaagcag aatgaaatga atgagctgat 1020 
ttcagatagt ggtcaaaatg caatcacagg tttcccaagc taaacatttc aaggtaaaat 1080 
gtagctgcgt cataaggaga cttctgtctt ctccagaagg caataccaac ctgaaagttc 1140 
cttctgtagc ctgaagagtt tgtaaatgac tttcataata aatagacact tgagttaact 1200 
ttttgtagga tacttgctcc attcatacac aacgtaatca aatatgtggt ccatctctga 1260 
aaacaagaga ctgcttgaca aaggagcatt gcagtcactt tgacaggttc cttttaagtg 1320 
gactctctga caaagtgggt actttctgaa aatttatata actgttgttg ataaggaaca 1380 
tttatccagg aattgatacg tttattagga aaagatattt ttataggctt ggatgttttt 1440 
agttccgact ttgaatttat ataaagtatt tttataatga ctggtcttcc ttacctggaa 1500 
aaacatgcga tgttagtttt agaattacac cacaagtatc taaatttcca acttacaaag 1560 
ggtcctatct tgtaaatatt gttttgcatt gtctgttggc aaatttgtga actgtcatga 1620 
tacgcttaag gtgggaaagt gttcattgca caatatattt ttactgcttt ctgaatgtag 1680 
acggaacagt gtggaagcag aaggcttttt taactcatcc gtttggccga tcgttgcaga 1740 
ccactgggag atgtggatgt ggttgcctcc ttttgctcgt ccccgtggct taacccttct 1800 

<210> 22 
<211> 261 
<212> PRT 

<213> Homo sapiens 
<400> 22 

Met Asp Trp Gly Thr Leu His Thr Phe Ile Gly Gly Val Asn Lys His 
1 5 10 15 

Ser Thr Ser Ile Gly Lys Val Trp Ile Thr Val Ile Phe Ile Phe Arg 
20 25 30 

Val Met Ile Leu Val Val Ala Ala Gln Glu Val Trp Gly Asp Glu Gln 
35 40 45 

Glu Asp Phe Val Cys Asn Thr Leu Gln Pro Gly Cys Lys Asn Val Cys 
50 55 60 

Tyr Asp His Phe Phe Pro Val Ser His Ile Arg Leu Trp Ala Leu Gln 
65 70 75 80 

Leu Ile Phe Val Ser Thr Pro Ala Leu Leu Val Ala Met His Val Ala 
85 90 95 

Tyr Tyr Arg His Glu Thr Thr Arg Lys Phe Arg Arg Gly Glu Lys Arg 
100 105 110 

Asn Asp Phe Lys Asp Ile Glu Asp Ile Lys Lys His Lys Val Arg Ile 
115 120 125 

Glu Gly Ser Leu Trp Trp Thr Tyr Thr Ser Ser Ile Phe Phe Arg Ile 
130 135 140 

Ile Phe Glu Ala Ala Phe Met Tyr Val Phe Tyr Phe Leu Tyr Asn Gly 
145 150 155 160 

Tyr His Leu Pro Trp Val Leu Lys Cys Gly Ile Asp Pro Cys Pro Asn 
165 170 175 

Leu Val Asp Cys Phe Ile Ser Arg Pro Thr Glu Lys Thr Val Phe Thr 
180 185 190 

Ile Phe Met Ile Ser Ala Ser Val Ile Cys Met Leu Leu Asn Val Ala 
195 200 205 

Glu Leu Cys Tyr Leu Leu Leu Lys Val Cys Phe Arg Arg Ser Lys Arg 
210 215 220 

Ala Gln Thr Gln Lys Asn His Pro Asn His Ala Leu Lys Glu Ser Lys 
225 230 235 240 

Gln Asn Glu Met Asn Glu Leu Ile Ser Asp Ser Gly Gln Asn Ala Ile 
245 250 255 

Thr Gly Phe Pro Ser 
260 

<210> 23 

<211> 21 

<212> DNA 

<213> Homo sapiens 

<400> 23 

attccaggcg acatcctcac t 21 



<210> 24 

<211> 24 

<212> DNA 

<213> Homo sapiens 

<400> 24 

gtttatccct gagtcctgtt tcca 24 



<210> 25 

<211> 27 

<212> DNA 

<213> Homo sapiens 

<400> 25 

tgtgcaccat tggcttctag gcactcc 27 

<210> 26 
<211> 2257 
<212> DNA 
<213> Homo sapiens
<400> 26 

attttgctta cagagtcccg tctcaccatc ctgggcttcc aacggagact gcggtatccg 60 
cggctggaga cccagcggcg agtagccttt tgctcccgga cggacttgag aggcttaaag 120 
gatggcctcg tcagatctgg aacaattatg ctctcatgtt aatgaaaaga ttggcaatat 180 
taagaaaacc ttatcattaa gaaactgtgg ccaggaacct accttgaaaa ctgtattaaa 240 
taaaatagga gatgagatca ttgtaataaa tgaacttcta aataaattgg aattggaaat 300 
tcagtatcaa gaacaaacca acaattcact caaggaactc tgtgaatctc ttgaagaaga 360 
ttacaaagac atagaacatc ttaaagaaaa cgttccttcc catttgcctc aagtaacagt 420 
aacccagagc tgtgttaagg gatcagatct tgatcctgaa gaaccaatca aagttgaaga 480 
acctgaaccc gtaaagaagc ctcccaaaga gcaaagaagt attaaggaaa tgccatttat 540 
aacttgtgat gagttcaatg gtgttccttc gtacatgaaa tcccgcttaa cctataatca 600 
aattaatgat gttattaaag aaatcaacaa ggcagtaatt agtaaatata aaatcctaca 660 
tcagccaaaa aagtctatga attctgtgac cagaaatctc tatcacagat ttattgatga 720 
agaaacgaag gataccaaag gtcgttattt tatagtggaa gctgacataa aggagttcac 780 
aactttgaaa gctgacaaga agtttcacgt gttactgaat attttacgac actgccggag 840 
gctatcagag gtccgagggg gaggacttac tcgttatgtt ataacctgag tcccttgtga 900 
acttttgaac ataccaacag ggtatagagt atagaggcta tttctataat tttcttatat 960 
ataatttttt taacttttaa tcttttttgt ttcctttttt ttttttttga gacaggatct 1020 
tgctttgtca cccaggggct tgctttgtca cgcaggctag agtgcagtgg cgcaaacatg 1080 
gctcactgca gcctcaacct cccaggctca agtgatcctc ccacctcagc cccctgaatg 1140 
gctgggacta caagcgtgcg ccaccatgcc tggctaattt ttgtattttt tggagagatg 1200 
gggtttcacc atgttgccta ggctggtctt gagctcctga gctcaaacaa tccaccctcc 1260 
tcagcctccc aaagtgctgg gattacaggc ttgagccacc acacctgacc tattcttgtt 1320 
tcttataaaa ataaaacttt tttggataaa gcttatttct tgtttttttc tttttctttt 1380 
tttttttttt tcgagactcc atctcagaaa aaaagaaaaa aagactgggt acagatgtga 1440 
tattggaaga aaaagatcaa gctgatgagg ttaggatacc caggcccttt ggacttaaag 1500 
atcactagtg tctaaattcc atcgatggca tttcagtcta taggtaaact tcctggaagc 1560 
tggatttgga gacagtttat catctgatta ttgggctttc gtataggtcc ttagggagca 1620 
gcttacctga aatgcattta gtgtacacca gtctgtaaac ttcaacctgt aatgaaagtg 1680 
taataaatgt acattgagtt gatgtgataa tgtgatataa taagaaatat atatttgatc 1740 
ttcctatcta gttccttgtt cagagctcct aaaacccttg taatttccaa agtgatggag 1800 
tacatctttt gttctagtat ttggtctttg accccagttc ctgacacaaa gctcctaaat 1860 
tcctttaaat ttcccagtga taggagaatt ttttgttcta atgaggtcac tcttgatggg 1920 
cacctggata actcaggatg ggggctgctc acaaagacca catcatgatt ggaagtttca 1980 
aactttcagt ctcccacctc cagagagggg agaggggctg gagatttgtg tcaataatcc 2040 
atcaggccta tgtcaacaag acataatccg ttaactatgg agttcaggga gcttcagggt 2100 
tggcaaacat tttgatgtgc caggaaggtg acgcactcca gctttatgaa gtcagcaagt 2160 
cctgtgctca ggatgcttyt ggaccttgcc ccaggtaccc cttcatgtgg ctgttgttca 2220 
tctgtatcct ttgtagtagc cttaaaataa actgtta 2257 



<210> 27 

<211> 255 

<212> PRT 

<213> Homo sapiens

<400> 27 

Met Ala Ser Ser Asp Leu Glu Gln Leu Cys Ser His Val Asn Glu Lys
1 5 10 15

Ile Gly Asn Ile Lys Lys Thr Leu Ser Leu Arg Asn Cys Gly Gln Glu
20 25 30

Pro Thr Leu Lys Thr Val Leu Asn Lys Ile Gly Asp Glu Ile Ile Val
35 40 45

Ile Asn Glu Leu Leu Asn Lys Leu Glu Leu Glu Ile Gln Tyr Gln Glu
50 55 60

Gln Thr Asn Asn Ser Leu Lys Glu Leu Cys Glu Ser Leu Glu Glu Asp
65 70 75 80

Tyr Lys Asp Ile Glu His Leu Lys Glu Asn Val Pro Ser His Leu Pro
85 90 95

Gln Val Thr Val Thr Gln Ser Cys Val Lys Gly Ser Asp Leu Asp Pro
100 105 110

Glu Glu Pro Ile Lys Val Glu Glu Pro Glu Pro Val Lys Lys Pro Pro
115 120 125

Lys Glu Gln Arg Ser Ile Lys Glu Met Pro Phe Ile Thr Cys Asp Glu
130 135 140

Phe Asn Gly Val Pro Ser Tyr Met Lys Ser Arg Leu Thr Tyr Asn Gln
145 150 155 160

Ile Asn Asp Val Ile Lys Glu Ile Asn Lys Ala Val Ile Ser Lys Tyr
165 170 175

Lys Ile Leu His Gln Pro Lys Lys Ser Met Asn Ser Val Thr Arg Asn
180 185 190

Leu Tyr His Arg Phe Ile Asp Glu Glu Thr Lys Asp Thr Lys Gly Arg
195 200 205

Tyr Phe Ile Val Glu Ala Asp Ile Lys Glu Phe Thr Thr Leu Lys Ala
210 215 220

Asp Lys Lys Phe His Val Leu Leu Asn Ile Leu Arg His Cys Arg Arg
225 230 235 240

Leu Ser Glu Val Arg Gly Gly Gly Leu Thr Arg Tyr Val Ile Thr
245 250 255



<210> 28 
<211> 22 
<212> DNA 

<213> Homo sapiens 
<400> 28 

cccagagctg tgttaaggga tc 22

<210> 29
<211> 23
<212> DNA

<213> Homo sapiens
<400> 29

gttaagcggg atttcatgta cga 23



<210> 30 

<211> 28 

<212> DNA 

<213> Homo sapiens 

<400> 30 

agaacctgaa cccgtaaaga agcctccc 28 

<210> 31
<211> 1740 
<212> DNA 

<213> Homo sapiens 
<400> 31 

atgaacaaac tgtatatcgg aaacctcagc gagaacgccg ccccctcgga cctagaaagt 60 
atcttcaagg acgccaagat cccggtgtcg ggacccttcc tggtgaagac tggctacgcg 120 
ttcgtggact gcccggacga gagctgggcc ctcaaggcca tcgaggcgct ttcaggtaaa 180 
atagaactgc acgggaaacc catagaagtt gagcactcgg tcccaaaaag gcaaaggatt 240 
cggaaacttc agatacgaaa tatcccgcct catttacagt gggaggtgct ggatagttta 300 
ctagtccagt atggagtggt ggagagctgt gagcaagtga acactgactc ggaaactgca 360 
gttgtaaatg taacctattc cagtaaggac caagctagac aagcactaga caaactgaat 420 
ggatttcagt tagagaattt caccttgaaa gtagcctata tccctgatga aacggccgcc 480 
cagcaaaacc ccttgcagca gccccgaggt cgccgggggc ttgggcagag gggctcctca 540 
aggcaggggt ctccaggatc cgtatccaag cagaaaccat gtgatttgcc tctgcgcctg 600 
ctggttccca cccaatttgt tggagccatc ataggaaaag aaggtgccac cattcggaac 660 
atcaccaaac agacccagtc taaaatcgat gtccaccgta aagaaaatgc gggggctgct 720 
gagaagtcga ttactatcct ctctactcct gaaggcacct ctgcggcttg taagtctatt 780 
ctggagatta tgcataagga agctcaagat ataaaattca cagaagagat ccccttgaag 840 
attttagctc ataataactt tgttggacgt cttattggta aagaaggaag aaatcttaaa 900 
aaaattgagc aagacacaga cactaaaatc acgatatctc cattgcagga attgacgctg 960 
tataatccag aacgcactat tacagttaaa ggcaatgttg agacatgtgc caaagctgag 1020 
gaggagatca tgaagaaaat cagggagtct tatgaaaatg atattgcttc tatgaatctt 1080 
caagcacatt taattcctgg attaaatctg aacgccttgg gtctgttccc acccacttca 1140 
gggatgccac ctcccacctc agggccccct tcagccatga ctcctcccta cccgcagttt 1200 
gagcaatcag aaacggagac tgttcatctg tttatcccag ctctatcagt cggtgccatc 1260 
atcggcaagc agggccagca catcaagcag ctttctcgct ttgctggagc ttcaattaag 1320 
attgctccag cggaagcacc agatgctaaa gtgaggatgg tgattatcac tggaccacca 1380 
gaggctcagt tcaaggctca gggaagaatt tatggaaaaa ttaaagaaga aaactttgtt 1440 
agtcctaaag aagaggtgaa acttgaagct catatcagag tgccatcctt tgctgctggc 1500 
agagttattg gaaaaggagg caaaacggtg aatgaacttc agaatttgtc aagtgcagaa 1560 
gttgttgtcc ctcgtgacca gacacctgat gagaatgacc aagtggttgt caaaataact 1620 
ggtcacttct atgcttgcca ggttgcccag agaaaaattc aggaaattct gactcaggta 1680
aagcagcacc aacaacagaa ggctctgcaa agtggaccac ctcagtcaag acggaagtaa 1740 



<210> 32 
<211> 579 
<212> PRT 

<213> Homo sapiens 
<400> 32 

Met Asn Lys Leu Tyr Ile Gly Asn Leu Ser Glu Asn Ala Ala Pro Ser
1 5 10 15

Asp Leu Glu Ser Ile Phe Lys Asp Ala Lys Ile Pro Val Ser Gly Pro
20 25 30

Phe Leu Val Lys Thr Gly Tyr Ala Phe Val Asp Cys Pro Asp Glu Ser
35 40 45

Trp Ala Leu Lys Ala Ile Glu Ala Leu Ser Gly Lys Ile Glu Leu His
50 55 60

Gly Lys Pro Ile Glu Val Glu His Ser Val Pro Lys Arg Gln Arg Ile
65 70 75 80

Arg Lys Leu Gln Ile Arg Asn Ile Pro Pro His Leu Gln Trp Glu Val
85 90 95

Leu Asp Ser Leu Leu Val Gln Tyr Gly Val Val Glu Ser Cys Glu Gln
100 105 110

Val Asn Thr Asp Ser Glu Thr Ala Val Val Asn Val Thr Tyr Ser Ser
115 120 125

Lys Asp Gln Ala Arg Gln Ala Leu Asp Lys Leu Asn Gly Phe Gln Leu
130 135 140

Glu Asn Phe Thr Leu Lys Val Ala Tyr Ile Pro Asp Glu Thr Ala Ala
145 150 155 160

Gln Gln Asn Pro Leu Gln Gln Pro Arg Gly Arg Arg Gly Leu Gly Gln
165 170 175

Arg Gly Ser Ser Arg Gln Gly Ser Pro Gly Ser Val Ser Lys Gln Lys
180 185 190

Pro Cys Asp Leu Pro Leu Arg Leu Leu Val Pro Thr Gln Phe Val Gly
195 200 205

Ala Ile Ile Gly Lys Glu Gly Ala Thr Ile Arg Asn Ile Thr Lys Gln
210 215 220

Thr Gln Ser Lys Ile Asp Val His Arg Lys Glu Asn Ala Gly Ala Ala
225 230 235 240

Glu Lys Ser Ile Thr Ile Leu Ser Thr Pro Glu Gly Thr Ser Ala Ala
245 250 255

Cys Lys Ser Ile Leu Glu Ile Met His Lys Glu Ala Gln Asp Ile Lys
260 265 270

Phe Thr Glu Glu Ile Pro Leu Lys Ile Leu Ala His Asn Asn Phe Val
275 280 285

Gly Arg Leu Ile Gly Lys Glu Gly Arg Asn Leu Lys Lys Ile Glu Gln
290 295 300

Asp Thr Asp Thr Lys Ile Thr Ile Ser Pro Leu Gln Glu Leu Thr Leu
305 310 315 320

Tyr Asn Pro Glu Arg Thr Ile Thr Val Lys Gly Asn Val Glu Thr Cys
325 330 335

Ala Lys Ala Glu Glu Glu Ile Met Lys Lys Ile Arg Glu Ser Tyr Glu
340 345 350

Asn Asp Ile Ala Ser Met Asn Leu Gln Ala His Leu Ile Pro Gly Leu
355 360 365

Asn Leu Asn Ala Leu Gly Leu Phe Pro Pro Thr Ser Gly Met Pro Pro
370 375 380

Pro Thr Ser Gly Pro Pro Ser Ala Met Thr Pro Pro Tyr Pro Gln Phe
385 390 395 400

Glu Gln Ser Glu Thr Glu Thr Val His Leu Phe Ile Pro Ala Leu Ser
405 410 415

Val Gly Ala Ile Ile Gly Lys Gln Gly Gln His Ile Lys Gln Leu Ser
420 425 430

Arg Phe Ala Gly Ala Ser Ile Lys Ile Ala Pro Ala Glu Ala Pro Asp
435 440 445

Ala Lys Val Arg Met Val Ile Ile Thr Gly Pro Pro Glu Ala Gln Phe
450 455 460

Lys Ala Gln Gly Arg Ile Tyr Gly Lys Ile Lys Glu Glu Asn Phe Val
465 470 475 480

Ser Pro Lys Glu Glu Val Lys Leu Glu Ala His Ile Arg Val Pro Ser
485 490 495

Phe Ala Ala Gly Arg Val Ile Gly Lys Gly Gly Lys Thr Val Asn Glu
500 505 510

Leu Gln Asn Leu Ser Ser Ala Glu Val Val Val Pro Arg Asp Gln Thr
515 520 525

Pro Asp Glu Asn Asp Gln Val Val Val Lys Ile Thr Gly His Phe Tyr
530 535 540

Ala Cys Gln Val Ala Gln Arg Lys Ile Gln Glu Ile Leu Thr Gln Val
545 550 555 560

Lys Gln His Gln Gln Gln Lys Ala Leu Gln Ser Gly Pro Pro Gln Ser
565 570 575

Arg Arg Lys

<210> 33

<211> 21

<212> DNA

<213> Homo sapiens

<400> 33

catggactgg ctttctggtt g 21

<210> 34

<211> 24

<212> DNA

<213> Homo sapiens

<400> 34

ctgagaaaag ctctggcctt aaac



